PerlMonks  

state preserving uniq

by Chady (Priest)
on Mar 01, 2003 at 08:50 UTC ( [id://239674]=CUFP )

Removes duplicate lines like UNIX uniq, but without having to sort the lines first. It preserves line positions by keeping the first occurrence of each line.

#!/usr/bin/perl
my %list;
my $i = 0;
while (<>) {
    next if defined $list{lc($_)};    # same key as stored below
    $list{lc($_)} = [ $i++, $_ ];     # [first-seen order, original line]
}
print map { $_->[1] } sort { $a->[0] <=> $b->[0] } values %list;

Replies are listed 'Best First'.
Re: state preserving uniq
by xmath (Hermit) on Mar 01, 2003 at 10:54 UTC
    Don't mean to be rude, but this can be done a lot simpler:
    perl -ne '$s{lc $_}++ or print'
    (It's actually a standard example to which I added lc. Unlike your version, it prints each line right at its first occurrence, rather than slurping in the whole file first.)
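For readers who prefer to see the one-liner spelled out, here is a minimal sketch of the same idiom as a script. The helper name `uniq_stream` and the in-memory sample input are assumptions made for illustration, not part of the thread. It relies on post-increment returning the previous count, which is 0 (false) the first time a key is seen, so each line is printed as soon as it is read.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, dropping any line
# whose lowercased form has already been seen.
sub uniq_stream {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        # $seen{...}++ yields the old count: false only on first sight.
        print {$out} $line unless $seen{ lc $line }++;
    }
    return;
}

# Made-up sample input, read through an in-memory filehandle.
my $text = "b\na\nB\na\nc\n";
open my $in, '<', \$text or die "cannot open in-memory input: $!";
uniq_stream($in, \*STDOUT);    # prints b, a, c (one per line)
```

Only the hash of seen keys is held in memory, not the lines themselves with their positions.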
      Or even:
      $s{+lc}++||print
      :-)

      Makeshifts last the longest.

        Yeah, of course; I was aiming for simplicity though, I wasn't golfing. :-)

        (it yields the same op-tree btw)

      Yeah, but my code uses a lot more memory than yours. ;)

      A typical example of how NOT to code in Perl. (I should get more sleep.)

      nice one-liner
      He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

      Chady | http://chady.net/
      And that can be sped up even more...
      perl -ne 'print if 1 == ++$s{lc $_}'
      (The difference is that you don't have to create temporary variables.)
        Although I can't test it right now, I sincerely doubt that your version is faster, especially if you compare the optrees:

        While your version does save copying the old integer value to a temporary (you realize the temporary SV is allocated at compile time, right?), it takes two extra ops, which I'm pretty sure cost more CPU time.

        And in any case the difference is too small (especially compared to lowercasing the string and doing a hash lookup) to justify the added code complexity.
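One way to settle the speed question empirically is the core Benchmark module. The sketch below is only an illustration: the sub names `dedupe_post` and `dedupe_pre` and the generated sample data are assumptions, and the result will depend on the Perl version and the input, so no timings are claimed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Post-increment form, as in: $s{lc $_}++ or print
sub dedupe_post {
    my (%s, @out);
    for (@_) { $s{ lc $_ }++ or push @out, $_ }
    return @out;
}

# Pre-increment form, as in: print if 1 == ++$s{lc $_}
sub dedupe_pre {
    my (%s, @out);
    for (@_) { push @out, $_ if 1 == ++$s{ lc $_ } }
    return @out;
}

# Made-up data: 1000 lines that collapse to 100 case-insensitive keys.
my @lines = map { ($_ % 2 ? "Line" : "line") . ($_ % 100) . "\n" } 1 .. 1000;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    post_inc => sub { my @r = dedupe_post(@lines) },
    pre_inc  => sub { my @r = dedupe_pre(@lines) },
});
```

Both subs return the same list, so any difference in the table comes from the increment style alone.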

Re: state preserving uniq
by steves (Curate) on Mar 01, 2003 at 12:13 UTC

    Why the lc? uniq as I know it is not case insensitive. You might want to note that behavior to any potential users.
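To make the difference concrete, here is a small sketch that makes case folding optional. The function name `uniq_lines` and its `$ignore_case` flag are hypothetical, introduced only for this example. Dropping the `lc` gives a case-sensitive comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep the first occurrence of each line.
# With a true $ignore_case, lines that differ only in case count as duplicates.
sub uniq_lines {
    my ($ignore_case, @lines) = @_;
    my %seen;
    return grep { !$seen{ $ignore_case ? lc($_) : $_ }++ } @lines;
}

# Made-up sample input.
my @input = ("Foo\n", "foo\n", "bar\n", "Foo\n");

print "case-sensitive:\n",   uniq_lines(0, @input);    # Foo, foo, bar
print "case-insensitive:\n", uniq_lines(1, @input);    # Foo, bar
```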

Re: state preserving uniq
by Intrepid (Deacon) on Jul 30, 2003 at 19:02 UTC

    This discussion also took place over here where a version is presented that takes its input on STDIN (like a typical *nix filter) as well as in the arguments in @ARGV.
