Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Re: To count letters (%identity) in DNA alignment

by bobf (Monsignor)
on Jan 27, 2009 at 03:32 UTC ( [id://739096]=note: print w/replies, xml ) Need Help??


in reply to To count letters (%identity) in DNA alignment

I've been known to reinvent wheels from time to time, but this particular wheel has been around the block a few times...

CPAN. Specifically Bio::Matrix::PSM::IO::masta. Per the docs, this module will

convert a set of aligned sequences:
ACATGCAT ACAGGGAT ACAGGCAT ACCGGCAT
to a PFM (SiteMatrix object)

Given the number of PFM-related modules within Bioperl I would imagine anything you want to do is already written.

Unfortunately I don't have time to provide an example. However, I would be very interested to see what you come up with if you go this route. Please post your solution when you are done so others can learn from your experience.

HTH

Replies are listed 'Best First'.
Re^2: To count letters (%identity) in DNA alignment
by sedm1000 (Initiate) on Jan 27, 2009 at 22:39 UTC
    Thanks for your input - very useful. I skipped the BioPerl modules as I wanted the script to be passed around non-BioPerl users. After further help and modification the final product was.
    use strict; use warnings; my $in = $ARGV[0]; my $frequencies; my $maxCol = 0; open (IN, "$in") or die "cannot open $in!\n"; while (my $line = <IN>) { chomp $line; my ($name, $seq) = split ("\t", $line); if (defined $seq) { my @bases = split '', $seq; $maxCol=@bases; for (my $i=0; $i<@bases; $i++) { ++$frequencies->{$bases[$i]}[$i]; # if ($maxCol < @bases) { # $maxCol = @bases; # } } } } print_hoa($frequencies, $maxCol); #foreach my $base (keys %frequencies) { # $frequencies{$base}[$_] ||= 0 for 0 .. $maxCol; # print "$base @{$frequencies{$base}}\n"; #} ##Subroutine that prints a hash fo arrays reference. sub print_hoa { my ($hashref, $len) = @_; for my $person (sort keys %{$hashref}) { my $array = $hashref->{$person}; my @array_print = @$array; print "$person\t"; foreach my $num (@array_print) { if (defined $num) { print "$num "; } else { print "0 "; } } my $i =@array_print; if ($i<$len) { my $z_pr = $len-$i; print "0 " x $z_pr, "\n"; } else { print "\n"; } } }

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://739096]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (2)
As of 2024-04-26 07:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found