
At each change in...

by malaga (Pilgrim)
on Jun 13, 2002 at 09:20 UTC

malaga has asked for the wisdom of the Perl Monks concerning the following question:

Is there any way of saying "at each change in"? i am reading down a column in a text file, and i want to do something at each change. so if i have
    apples
    apples
    apples
    oranges
i want to "do something" on the first apple, and on the first orange. do i need to use a counter, or say "if $foo ne $previousfoo", or is there any easier way? the search didn't come up with anything. thanks.

Replies are listed 'Best First'.
Re: At each change in...
by stefp (Vicar) on Jun 13, 2002 at 09:33 UTC
    If you want to detect change from one line to another, indeed something like if $foo ne $previousfoo will do.

    But detecting the first occurrence of a chain is quite another problem, one that can be solved by memorizing the number of occurrences as the value associated with the string:

    perl -ne 'chomp; print "new $_\n" unless $a{$_}++' nm_of_processed_file
    The loop may have to be explicit depending on the source of your data.
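
A minimal sketch of what the explicit-loop form of that one-liner might look like. The helper name `first_occurrences` and the in-memory filehandle are assumptions made for illustration; they are not from the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: read lines from an open filehandle and return each
# distinct value the first time it is seen anywhere in the input.
sub first_occurrences {
    my ($fh) = @_;
    my (%seen, @first);
    while (my $line = <$fh>) {
        chomp $line;
        # %seen counts occurrences; a count of zero means "new".
        push @first, $line unless $seen{$line}++;
    }
    return @first;
}

# In-memory filehandle standing in for the real data source.
my $data = "apples\napples\noranges\napples\n";
open my $in, '<', \$data or die "cannot open in-memory file: $!";
print "new $_\n" for first_occurrences($in);
```

Note that the hash remembers every value seen so far, so the second run of apples in this sample is not reported. That is the difference from comparing each line with the previous one.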

    -- stefp -- check out TeXmacs wiki

Re: At each change in...
by Abigail-II (Bishop) on Jun 13, 2002 at 09:55 UTC
    Well, if you are on a Unix system, make use of its toolkit. One of the tools is called sort, and it usually has options for "merge, do not sort" and "uniques only". On my system, these options are enabled with -mu, but check your manual page. Then, instead of opening your file with:
    open my $fh => "file" or die $!;
    open it with:
    open my $fh => "sort -mu file |" or die $!;

    No counter needed, no check, no keeping track of the previous. Ultimate code reuse.


      For me, the whole point of perl is to be able to forget this Unix stuff by getting everything under one hood. Perl is (almost) the same on every platform. And the effort to learn perl is offset by being able to forget this myriad of Unix commands, each with a myriad of options that are supported or not on a given platform. Also, perl often avoids overhead. In the case at hand, we have to fork and pipe (no big deal) and sort (this last can cost if the processed file is very big).

      But again, TMTOWTDI.

      -- stefp -- check out TeXmacs wiki

        Is having to learn a myriad of Unix commands worse than having to learn a myriad of APIs of a myriad of Perl modules? Or do you plan to do everything yourself? I'd rather reuse what others have done....

        Note also that IEEE Std 1003.1 - 2001 (aka POSIX) "Shell and Utilities" require the -m and -u options.

        As for the overhead of sort, in the given example, no sort is actually done. The -m option, for merge, merges files which are already presumed to be sorted - but when given one file, this passes things unsorted.


      Why not just use the program uniq? It would appear to perform the same function as you desire, and is perhaps more recognizable when reading.
        uniq works as well; my point was more about using the toolkit available than about sort vs uniq.
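
A rough sketch of the toolkit approach using uniq, assuming a Unix-like system with uniq on the PATH. The helper name `read_runs` is hypothetical, and the list form of the pipe open is used here in place of the "command |" string shown earlier so that the filename is not interpreted by a shell.

```perl
use strict;
use warnings;

# Hypothetical helper: read $path through uniq(1), so that each run of
# identical adjacent lines arrives as a single line.
sub read_runs {
    my ($path) = @_;
    # List-form pipe open: runs "uniq $path" without going through a shell.
    open my $fh, '-|', 'uniq', $path or die "cannot run uniq: $!";
    chomp(my @runs = <$fh>);
    close $fh or die "uniq exited with status $?";
    return @runs;
}
```

Called as `read_runs("file")`, this hands back one line per run, so the calling code needs no counter and no previous-value variable.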


Re: At each change in...
by marvell (Pilgrim) on Jun 13, 2002 at 09:36 UTC

    Assuming you don't want to write the three lines which compare the present value to some old value, you could use:

    while(<DATA>) {
        chomp;
        print "pleasure - $_\n" if ($_ ne $old) && ($old = $_);
    }
    __DATA__
    apples
    apples
    apples
    oranges
    pears
    bread
    bread

    Steve Marvell

Re: At each change in...
by broquaint (Abbot) on Jun 13, 2002 at 09:49 UTC
    do i need to use a counter, or say "if $foo ne $previousfoo", or is there any easier way?
    You don't necessarily need to do that - you are using perl after all ...
    open(my $fh, "somefile.txt") or die("ack - $!");
    my @lines = <$fh>; # let's hope this isn't a large file ...
    for (0 .. $#lines) {
        # index 0 has no predecessor ($lines[-1] would be the last line)
        print "a change - $lines[$_]" if $_ == 0 or $lines[$_] ne $lines[$_ - 1];
    }
    However, it's probably easier just to keep your state in a variable, as it's straightforward and doesn't require slurping the whole file.
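
For illustration, keeping the state in a variable could look something like the sketch below. The name `at_each_change` and the callback interface are assumptions, not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback with the first line of every run of
# equal lines, reading one line at a time instead of slurping the file.
sub at_each_change {
    my ($fh, $callback) = @_;
    my $prev;
    while (my $line = <$fh>) {
        chomp $line;
        # $prev is undef before the first line, so the first line always counts.
        $callback->($line) if !defined $prev or $line ne $prev;
        $prev = $line;
    }
    return;
}

# In-memory filehandle standing in for somefile.txt.
my $data = "apples\napples\napples\noranges\n";
open my $in, '<', \$data or die "cannot open in-memory file: $!";
at_each_change($in, sub { print "a change - $_[0]\n" });
```

Only the previous line is held in memory, and values such as "0" or an empty line are handled because the comparison does not rely on the truth of an assignment.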


Re: At each change in...
by malaga (Pilgrim) on Jun 13, 2002 at 09:52 UTC
    very helpful, thanks!
Re: At each change in...
by malaga (Pilgrim) on Jun 13, 2002 at 11:13 UTC
    THAT'S what i was hoping for :) thanks.