PerlMonks  

Re^2: Searching and Coutning using 2 files with multiple columns

by shart3 (Novice)
on Sep 16, 2009 at 16:38 UTC


in reply to Re: Searching and Coutning using 2 files with multiple columns
in thread Searching and Coutning using 2 files with multiple columns

Changing the Boundary.out file to:

chr1 3204563 3661775 -
chr1 3204563 3600000 -
chr1 3204500 3660000 -
chr1 3204000 3204001 -
chr1 3204563 3760000 -

better illustrates the point. There are 3 instances where I should get a positive result for the first line in DB.out. However, rather than counting the number of times a hit occurs, the line from DB.out is printed 3 times (each with the count value of "1"):

Xkr4 chr1 3204562 3661779 - 3 457.217 1
Xkr4 chr1 3204562 3661779 - 3 457.217 1
Xkr4 chr1 3204562 3661779 - 3 457.217 1

I also added a step to reset the count to zero on each pass through the loop, leaving:

my @file1 = ();
open(FILE, @ARGV[0]) || die ("could not open file @ARGV[0]\n");
while (my $line = <FILE>) {
    chomp $line;
    my ($chr, $start, $stop) = split(/\t/, $line);
    push @file1, [$chr, $start, $stop];
}
close FILE;

open(FILE, @ARGV[1])||die ("could not open file @ARGV[1]\n");
while(<FILE>){
    ($Gene,$Chrom,$ModStart,$ModEnd,$Strand,$ExonCount,$SizeKB)= split;
    foreach my $line (@file1){
        $Count = 0;
        my ($chr, $start, $stop) = @$line;
        if ($chr eq $Chrom && $start gt $ModStart && $end lt $ModEnd){
            $Count++;
            print ("$Gene\t$Chrom\t$ModStart\t$ModEnd\t$Strand\t$ExonCount\t$SizeKB\t$Count\n")
        }
    }
}

How do I get it to change the number to 3, and not print 3 times?

Re^3: Searching and Coutning using 2 files with multiple columns
by kennethk (Abbot) on Sep 16, 2009 at 18:10 UTC
    If you want to aggregate results, you need to specify what you are aggregating by. What are you trying to count? Based on what you've written, I will guess you want to know the number of lines of Boundary.out that match each line of DB.out. You can accomplish this by just moving your counter and print statements outside of the foreach loop:

    my $Count = 0;
    foreach my $line (@file1){
        my ($chr, $start, $stop) = @$line;
        if ($chr eq $Chrom && $start gt $ModStart && $end lt $ModEnd){
            $Count++;
        }
    }
    print ("$Gene\t$Chrom\t$ModStart\t$ModEnd\t$Strand\t$ExonCount\t$SizeKB\t$Count\n");
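
For reference, the same idea can be sketched as a small self-contained script. This is only a sketch under assumptions: the helper name `count_hits` is hypothetical, the rows are hard-coded from the post instead of being read from Boundary.out and DB.out, and it tests `$stop` with numeric `>` and `<`, whereas the posted code tests `$end` (never assigned in the code shown) with the string operators `gt` and `lt`.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count the boundary records that
# lie strictly inside one gene model on the same chromosome.
sub count_hits {
    my ($boundaries, $chrom, $mod_start, $mod_end) = @_;
    my $count = 0;    # initialised once per gene, before the inner loop
    foreach my $rec (@$boundaries) {
        my ($chr, $start, $stop) = @$rec;
        # Assumption: coordinates are integers, so compare numerically,
        # and use $stop (the variable that is actually assigned).
        $count++
            if $chr eq $chrom && $start > $mod_start && $stop < $mod_end;
    }
    return $count;
}

# Sample rows copied from the post (strand column dropped).
my @boundaries = (
    [qw(chr1 3204563 3661775)],
    [qw(chr1 3204563 3600000)],
    [qw(chr1 3204500 3660000)],
    [qw(chr1 3204000 3204001)],
    [qw(chr1 3204563 3760000)],
);

# One DB.out line: Gene, Chrom, ModStart, ModEnd, Strand, ExonCount, SizeKB.
my @gene = qw(Xkr4 chr1 3204562 3661779 - 3 457.217);

# Print the gene line once, with the total appended as the last column.
my $count = count_hits(\@boundaries, @gene[1 .. 3]);
print join("\t", @gene, $count), "\n";
```

With these changes the sample data gives a count of 2, not 3, because the last boundary row ends beyond the gene's end coordinate. The three hits in the post appear to come from the unset `$end`, which compares as an empty string and so makes the end test true for every row.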
