PerlMonks  


This is even less memory efficient, but I couldn't resist turning your problem into a golfed one-liner. I'm sure someone else will squeeze a few extra characters out of it:

perl -ane "push@{$h{$F[0]}},$_;END{while(($k,$v)=each%h){print@{$v}if@{$v}>1}}" dat1.txt dat2.txt

-a autosplits each input line into @F, and -n wraps the -e code in a while(<>){...} loop. So as this one-liner iterates over the two (or more) files, it pushes each line onto an anonymous array held in a hash whose keys are the "objectN" values (the first element of @F).

After the implicit while loop (-n) finishes, the END{} block is executed. Here we test each hash element to see if its anonymous array holds more than one element. If it does, we print the array. We're taking advantage of the fact that each array element still contains the \n newline from the original file's line endings, and that's why "print @ARRAY" results in one element per output line.

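I hope my description of this solution helps, but you can also brush up on perlrun for more details. There are a couple of caveats with this one-liner. First, both files are slurped into a hash in their entirety. Second, the groups are printed in no particular order, because each() walks the hash in whatever order it happens to be stored (the lines within a group do keep their input order).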
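For anyone who finds the golfed version hard to read, here is a rough ungolfed sketch of the same idea. The sub name duplicate_groups and the sample lines are made up for illustration, and it deviates from the one-liner in two ways that I've marked in the comments: it skips blank lines, and it sorts the keys so the output order is repeatable.

```perl
use strict;
use warnings;

# Group lines by their first whitespace-separated field and return only
# the lines whose key was seen more than once.
# (duplicate_groups is a hypothetical name, not part of the one-liner.)
sub duplicate_groups {
    my @lines = @_;
    my %by_key;
    for my $line (@lines) {
        my ($key) = split ' ', $line;    # same splitting rule that -a uses
        next unless defined $key;        # deviation: skip blank lines
        push @{ $by_key{$key} }, $line;  # line keeps its trailing newline
    }
    # Deviation: sort the keys so groups come out in a stable order,
    # instead of the arbitrary order each() gives in the one-liner.
    return map  { @{ $by_key{$_} } }
           grep { @{ $by_key{$_} } > 1 }
           sort keys %by_key;
}

# Hypothetical sample data standing in for the lines of dat1.txt and dat2.txt.
my @sample = (
    "object1 10 20\n",
    "object2 30 40\n",
    "object1 50 60\n",
    "object3 70 80\n",
);

print duplicate_groups(@sample);
```

To run it against real files, one option is to replace @sample with <> in the last statement and pass the file names on the command line, just as with the one-liner. It still holds every line in memory, so the first caveat applies here too.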


Dave


In reply to Re: comparing two files for duplicate entries by davido
in thread comparing two files for duplicate entries by Angharad
