PerlMonks  

Re: how to speed up comparison between two files

by wee (Scribe)
on Dec 10, 2014 at 22:34 UTC ( [id://1109969]=note )


in reply to how to speed up comparison between two files

I started to come up with a solution much like BrowserUK's, though not quite as elegant, and I decided to move on since his works well enough.

In order to see what was going on, I wanted to run your code. In doing so I fixed all the scoping errors (the 'strict' pragma should have been made the default behavior in like 1996), corrected the whitespace issues, removed unnecessary comments, and closed your filehandles after they were done being used. In case anyone else wants to copy/paste a working version of the original code (which uses Inline::Files), here it is:

use warnings;
use strict;
use Inline::Files;

#open(AB, "try_fimo.txt") || die("cannot open");
#my @data = <AB>;
#close(AB);
#chomp(@data);
#
#open(BC, "try_fimo2.txt") || die("cannot open");
#my @data2 = <BC>;
#close(BC);
#chomp(@data2);

my @data = <AB>;
my @data2 = <BC>;

my ($t1, $t2, $t3);
my (@tf1, @seq1, @dis1);
my (@tf2, @seq2, @dis2);
my (@tf3, @seq3, @dis3);

foreach my $line (@data) {
    foreach my $line2 (@data2) {
        if ($line2 =~ /(.*?)\s+(.*?)\s+(.*)/) {
            $t1 = $1; # eg. in first row from file2 i.e. ABC, it will first take A followed by B & C
            $t2 = $2;
            $t3 = $3;
        }
        if ($line =~ /(.*?)\s+(.*?)\s+(.*)/) {
            if ($1 eq $t1) {
                push(@tf1, $1);
                push(@seq1, $2);
                push(@dis1, $3);
                # print $1,"\t",$2,"\t",$3,"\t";
            }
            elsif ($1 eq $t2) {
                push(@tf2, $1);
                push(@seq2, $2);
                push(@dis2, $3);
            }
            elsif ($1 eq $t3) {
                push(@tf3, $1);
                push(@seq3, $2);
                push(@dis3, $3);
            }
        }
    }
}

for (my $i = 0; $i < @tf1; $i++) {
    for (my $j = 0; $j < @tf2; $j++) {
        for (my $k = 0; $k < @tf3; $k++) {
            if (($seq1[$i] eq $seq2[$j]) && ($seq1[$i] eq $seq3[$k])) {
                if (($tf1[$i] ne $tf2[$j]) && ($tf1[$i] ne $tf3[$k])) {
                    print $tf1[$i], "\t", $seq1[$i], "\t", $dis1[$i], "\t",
                          $tf2[$j], "\t", $seq2[$j], "\t", $dis2[$j], "\t",
                          $tf3[$k], "\t", $seq3[$k], "\t", $dis3[$k], "\n";
                }
            }
        }
    }
}

__AB__
A seq1 20
B seq2 25
B seq2 80
B seq1 40
C seq1 25
D seq2 30
E seq2 45
__BC__
A B C
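
For anyone wondering where the time goes in the code above: the triple loop compares every row of the first group against every row of the second and third. One way to avoid that is to index the rows by name and then by sequence in a hash, so that only rows sharing a sequence are ever combined. The following is a rough sketch of that idea, not BrowserUK's solution and not a drop-in replacement. The helper names index_rows and matches_for_triple are made up for illustration, it assumes whitespace-separated "name sequence distance" rows as in the sample data, and it emits matches grouped by sequence rather than in the original's row order.

```perl
use strict;
use warnings;

# Hypothetical helper: index "name seq distance" rows by name, then by sequence.
sub index_rows {
    my (@rows) = @_;
    my %by_tf;
    for my $row (@rows) {
        my ($tf, $seq, $dis) = split ' ', $row;
        next unless defined $dis;    # skip blank or malformed rows
        push @{ $by_tf{$tf}{$seq} }, $dis;
    }
    return \%by_tf;
}

# Hypothetical helper: for one triple of names, return tab-joined output
# lines for every combination of rows that share a sequence.
sub matches_for_triple {
    my ($by_tf, $t1, $t2, $t3) = @_;

    # Mirrors the 'ne' checks in the original triple loop.
    return () if $t1 eq $t2 || $t1 eq $t3;

    my $h1 = $by_tf->{$t1} || {};
    my $h2 = $by_tf->{$t2} || {};
    my $h3 = $by_tf->{$t3} || {};

    my @out;
    for my $seq (sort keys %$h1) {
        # Only sequences present under all three names can match.
        next unless $h2->{$seq} && $h3->{$seq};
        for my $d1 (@{ $h1->{$seq} }) {
            for my $d2 (@{ $h2->{$seq} }) {
                for my $d3 (@{ $h3->{$seq} }) {
                    push @out, join("\t",
                        $t1, $seq, $d1,
                        $t2, $seq, $d2,
                        $t3, $seq, $d3);
                }
            }
        }
    }
    return @out;
}

# Same sample data as the __AB__ and __BC__ sections above.
my @data = (
    "A seq1 20", "B seq2 25", "B seq2 80", "B seq1 40",
    "C seq1 25", "D seq2 30", "E seq2 45",
);
my @data2 = ("A B C");

my $index = index_rows(@data);
for my $line2 (@data2) {
    my ($t1, $t2, $t3) = split ' ', $line2;
    next unless defined $t3;
    print "$_\n" for matches_for_triple($index, $t1, $t2, $t3);
}
```

With the sample data this prints the single tab-separated line "A seq1 20 B seq1 40 C seq1 25", which is what the original triple loop produces for the same input. The first file is read once to build the index, and each triple from the second file then only touches rows that already share a sequence.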
