PerlMonks  

How can I count the number and kinds of letters at 1st, 2nd and 3rd positions of 3-letter words in a string?

by supriyoch_2008 (Monk)
on Apr 25, 2012 at 15:00 UTC ( [id://967084]=perlquestion )

supriyoch_2008 has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.

Replies are listed 'Best First'.
Re: How can I count the number and kinds of letters at 1st, 2nd and 3rd positions of 3-letter words in a string?
by davido (Cardinal) on Apr 25, 2012 at 16:02 UTC

      That brings up a question I've been wondering about: Obviously we get a lot of questions about mining bioinformatics data, from people in that business who are trying to become Perl programmers in their spare time. (There's nothing wrong with that, of course, although when I started learning Perl, I started with "hello world," not "mine gigabytes of data for complex character patterns.")

      So, with many bioinformatics folks doing it themselves, I figure there must be many more who would rather hire a programmer. I'd further guess that many wouldn't need a full-time person, just someone they can call to put together quick scripts. Is that what the pros here are seeing? Is there a large demand in the industry for Perl programmers? Would it make sense to study up on how the data works, to be able to promote oneself as a "bioinformatics data mining guy"?

      Aaron B.
      My Woefully Neglected Blog, where I occasionally mention Perl.

        Perl has always been a language dedicated to getting things done. While many of us (myself included) enjoy the exploration of deeper topics, many who use it are more interested in the result than in the tool used to obtain the result. There's nothing wrong with that. But as you've identified, it might benefit some of those people to hire someone. Nevertheless, one of Perl's strengths is that it is within reach of the "weekend mechanics" of programming. If you need to rebuild a car's transmission you'll probably send that out to a mechanic. But if all you're doing is changing brake pads or even building a go-cart with a lawn mower engine, you might tackle that yourself just because you can. That's one of Perl's strengths; the weekend programmer, non-CS student, sysadmin, biologist, and sales manager can all accomplish a lot with the "baby Perl" subset.

        As I attend Perl Mongers meetings, and as I work with clients, it's easy to forget that not everyone is building big web applications sitting on top of database abstractions and powerful frameworks. Not everyone has a release manager, version control, a QA department, unit testing requirements, and all those other things that are common in "the industry." Perl is used within the programming industry, but it's also heavily used just to get things done.

        Whether there's money to be had seeking contracts in the bioinformatics industry, I have no idea. I've always thought (perhaps wrongly so) that many of our bioinformatics questions are coming from academia, which is not necessarily a pot of gold.


        Dave

      Hi Dave, there is a third possibility when a user seems not to be learning anything: a shared account.



      - Boldra
Re: How can I count the number and kinds of letters at 1st, 2nd and 3rd positions of 3-letter words in a string?
by johngg (Canon) on Apr 25, 2012 at 15:32 UTC

    Use a hash of hashes rather than individual scalar count variables. You could use unpack to break the string into words and a global regex match and capture for the individual positions.

    knoppix@Microknoppix:~$ perl -Mstrict -MData::Dumper -wE '
    > my $str = q{ATCGGCGCCTAT};
    > my @words = unpack q{(a3)*}, $str;
    > my %counts;
    >
    > foreach my $word ( @words )
    > {
    >     my $posn;
    >     $counts{ q{position } . ++ $posn }->{ $1 } ++
    >         while $word =~ m{(.)}g;
    > }
    >
    > print Data::Dumper->Dumpxs( [ \ %counts ], [ qw{ *counts } ] );'
    %counts = (
                'position 1' => {
                                  'A' => 1,
                                  'T' => 1,
                                  'G' => 2
                                },
                'position 3' => {
                                  'T' => 1,
                                  'C' => 3
                                },
                'position 2' => {
                                  'A' => 1,
                                  'T' => 1,
                                  'C' => 1,
                                  'G' => 1
                                }
              );
    knoppix@Microknoppix:~$

    I hope this is helpful.

    Cheers,

    JohnGG
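
The hash-of-hashes idea in the reply above can also be written as a standalone script. This is only a sketch under a few assumptions: the helper name count_by_position and its optional width argument are made up for illustration, and split is used in place of the global regex match, which should give the same counts.

```perl
use strict;
use warnings;

# count_by_position is a hypothetical helper name, not taken from the thread.
# It returns a hash of hashes: { position => { letter => count } }.
sub count_by_position {
    my ($str, $width) = @_;
    $width //= 3;    # assumed default: 3-letter words
    my %counts;

    # unpack with "(aN)*" cuts the string into N-character words;
    # a shorter trailing word, if any, is kept as-is.
    for my $word ( unpack "(a$width)*", $str ) {
        my $posn = 0;
        $counts{ ++$posn }{$_}++ for split //, $word;
    }
    return \%counts;
}

my $counts = count_by_position('ATCGGCGCCTAT');
for my $posn ( sort { $a <=> $b } keys %$counts ) {
    my $letters = $counts->{$posn};
    print "position $posn: ",
        join( ', ', map { $_ . '=' . $letters->{$_} } sort keys %$letters ),
        "\n";
}
# position 1: A=1, G=2, T=1
# position 2: A=1, C=1, G=1, T=1
# position 3: C=3, T=1
```

Sorting the keys on output makes the report stable, since hash order is not.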

Re: How can I count the number and kinds of letters at 1st, 2nd and 3rd positions of 3-letter words in a string?
by NetWallah (Canon) on Apr 25, 2012 at 15:41 UTC
    Build and print a Hash of Arrayrefs like this:
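    my ($i, %x);
    $i = 0;
    # $a holds the input string; count each letter by its position (0, 1, 2) in the triplet
    $x{$_}[$i++ % 3]++ for split //, $a;
    for my $k (sort keys %x) {
        my $aref = $x{$k};
        print "$k ";
        # a letter never seen at a position has no count yet, so print 0 for it
        print join ' ', map { $aref->[$_] // 0 } 0 .. 2;
        print "\n";
    }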
    (untested)

                 All great truths begin as blasphemies.
                       ― George Bernard Shaw, writer, Nobel laureate (1856-1950)

Re: How can I count the number and kinds of letters at 1st, 2nd and 3rd positions of 3-letter words in a string?
by brx (Pilgrim) on Apr 25, 2012 at 17:52 UTC

    First, read davido's answer: Re: How can I count the number and kinds of letters at 1st, 2nd and 3rd positions of 3-letter words in a string?

    Here is some code, kept as simple as possible (I hope), without complex data structures:

    #!perl
    use strict;
    use warnings;

    my $seq = "ATCGGCGCCTAT";
    my (%first, %second, %third);

    #perlre
    my @trilet = $seq =~ /.../g;

    #perlsyn LOOP
    foreach my $letter ('A','T','G','C') {
        #init
        $first{ $letter }  = 0;
        $second{ $letter } = 0;
        $third{ $letter }  = 0;
    }

    foreach my $tri (@trilet) {
        #perlfunc : substr
        $first{ substr $tri,0,1 }++;
        $second{ substr $tri,1,1 }++;
        $third{ substr $tri,2,1 }++;
    }

    foreach my $letter ('A','T','G','C') {
        print "$letter=$first{$letter}; ";
    }
    print "\n";

    foreach my $letter ('A','T','G','C') {
        print "$letter=$second{$letter}; ";
    }
    print "\n";

    foreach my $letter ('A','T','G','C') {
        print "$letter=$third{$letter}; ";
    }
    print "\n";

Re: How can I count the number and kinds of letters at 1st, 2nd and 3rd positions of 3-letter words in a string?
by BillKSmith (Monsignor) on Apr 27, 2012 at 03:37 UTC

    One more way to parse the sequence. The initialization statement is only needed if zero-counts must be defined.

    use strict;
    use warnings;
    use Readonly;
    use Data::Dumper qw( Dumper );

    Readonly::Scalar my $seq => "ATCGGCGCCTAT";

    my %count;
    @count{ qw(A1 A2 A3 T1 T2 T3 C1 C2 C3 G1 G2 G3 ) } = (0) x (3*4);

    foreach my $i (0 .. length($seq)-1 ) {
        my $pos  = $i % 3 + 1;
        my $base = substr $seq, $i, 1;
        $count{$base.$pos}++;
    }

    $Data::Dumper::Sortkeys = 1;
    print Dumper \%count;

    Or, if you really want individual scalar counts and do not mind global variables or symbolic references:

    use strict;
    use warnings;

    my $seq = 'ATCGGCGCCTAT';
    my %count;
    our( $A1, $A2, $A3, $T1, $T2, $T3, $C1, $C2, $C3, $G1, $G2, $G3 ) = (0) x 12;

    foreach my $i (0 .. length($seq)-1 ) {
        my $pos  = $i % 3 + 1;
        my $base = substr $seq, $i, 1;
        {no strict 'refs'; ${$base.$pos}++;}
    }

    print $A1, $A2, $A3, $T1, $T2, $T3, $C1, $C2, $C3, $G1, $G2, $G3;
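
To illustrate the remark above that the initialization is only needed if zero counts must be defined, here is a minimal sketch of the same loop without it. The helper name count_base_pos is hypothetical; keys such as "A1" come into existence on first increment, and combinations that never occur are read as 0 only when printing.

```perl
use strict;
use warnings;

# count_base_pos is a hypothetical name used only for this sketch.
# Keys like "A1" or "G3" are created the first time they are incremented,
# so no up-front initialization is required.
sub count_base_pos {
    my ($seq) = @_;
    my %count;
    for my $i ( 0 .. length($seq) - 1 ) {
        my $pos  = $i % 3 + 1;            # 1, 2 or 3 within the triplet
        my $base = substr $seq, $i, 1;
        $count{ $base . $pos }++;
    }
    return \%count;
}

my $count = count_base_pos('ATCGGCGCCTAT');
for my $base (qw(A T C G)) {
    # A combination that never occurred has no key; report it as 0.
    print join( ' ',
        map { $base . $_ . '=' . ( $count->{ $base . $_ } // 0 ) } 1 .. 3 ),
        "\n";
}
# A1=1 A2=1 A3=0
# T1=1 T2=1 T3=1
# C1=0 C2=1 C3=3
# G1=2 G2=1 G3=0
```

The trade-off is that code reading the hash elsewhere has to handle missing keys itself, which the pre-initialized version avoids.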
