PerlMonks  

Re: how to read input from a file, one section at a time?

by jwkrahn (Abbot)
on Feb 21, 2019 at 22:49 UTC


in reply to how to read input from a file, one section at a time?

This may work: (UNTESTED)

    open my $out_file, '>', 'aa_report.txt'
        or die "Cannot open 'aa_report.txt' because: $!";

    print 'PLEASE ENTER THE FILENAME OF THE PROTEIN SEQUENCE: ';
    chomp( my $prot_filename = <STDIN> );

    open my $PROTFILE, '<', $prot_filename
        or die "Cannot open '$prot_filename' because: $!";

    $/ = '';  # Set paragraph mode

    while ( my $para = <$PROTFILE> ) {

        # Remove fasta header line
        $para =~ s/^>.*//m;

        # Remove comment line(s)
        $para =~ s/^\s*#.*//mg;

        my %prot;
        $para =~ s/([A-Z])/ ++$prot{ $1 } /eg;

        print $out_file join( ' ', map "$_=$prot{$_}", sort keys %prot ), "\n";
    }
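To see what paragraph mode does here, the following is a minimal sketch that reads from an in-memory string instead of a file. The sample data and the sub name `count_records` are made up for illustration. It also counts with a list-context match rather than the `s///e` substitution above, which is a swapped-in variation that leaves the record text unmodified.

```perl
use strict;
use warnings;

# Hypothetical sample data: two FASTA-style records separated by blank lines.
my $sample = ">seq1 demo\nACDA\nCC\n\n\n>seq2 demo\n# a comment\nGGT\n";

# Count uppercase letters per blank-line-separated record.
# Returns one hash reference of letter counts per record.
sub count_records {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = '';    # paragraph mode, restored when the sub returns
    my @results;
    while ( my $para = <$fh> ) {
        $para =~ s/^>.*//m;        # drop the header line
        $para =~ s/^\s*#.*//mg;    # drop comment lines
        my %count;
        $count{$_}++ for $para =~ /[A-Z]/g;    # one increment per letter
        push @results, \%count;
    }
    return @results;
}

for my $count ( count_records($sample) ) {
    print join( ' ', map "$_=$count->{$_}", sort keys %$count ), "\n";
}
```

Using `local $/` inside a sub keeps the change from leaking into other code that reads line by line.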

Re^2: how to read input from a file, one section at a time?
by davi54 (Sexton) on Feb 21, 2019 at 23:38 UTC

    Hi, thank you so much. It works.

    I have a follow-up question. Once I have the output in the output file, I want to count how many different letters appear in the output for each sequence.

    Example:

    for the output: "A=27 C=3 D=16 E=14 F=1 G=11 H=4 I=12 K=10 L=13 M=2 N=6 P=10 Q=6 R=6 S=14 T=20 V=11 W=4 Y=10"

    I want it to count the distinct letters in each result and report how many there are. In the above example, 20 letters are present, so I want my final output to look something like: number of alphabets = 20. How can I do it?
      The line below, placed inside the while loop after the existing print, will give you that count.

      print $out_file "Number of proteins = ", scalar keys %prot, "\n";
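As a small illustration of why that line works, `keys` in scalar context returns the number of keys in the hash, which here is the number of distinct letters seen, not the total number of residues. The counts below are hypothetical stand-ins for `%prot`, and the "Number of proteins" label is kept from the thread.

```perl
use strict;
use warnings;
use List::Util 'sum';

# Hypothetical counts for one sequence, standing in for %prot from the loop.
my %prot = ( A => 11, D => 5, E => 12, F => 1 );

# In scalar context, keys returns the number of keys in the hash.
my $distinct = scalar keys %prot;

# For comparison: the total number of letters counted is the sum of the values.
my $total = sum( values %prot );

print "Number of proteins = $distinct\n";
print "Total letters counted = $total\n";
```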

        Hi, Thank you. It works perfectly.

        One last thing I would like my script to do is to look at those "Number of proteins" values and tell me which entry in the original input file has the smallest one. So the output should look like the following:

        A=11 D=5 E=12 F=1 G=5 I=6 K=3 L=7 M=2 N=4 P=2 Q=9 R=10 T=4 V=10

        Number of proteins = 15

        Entry ">sp|Q2M7X4|YICS_ECOLI Uncharacterized protein YicS OS=Escherichia coli (strain K12) OX=83333 GN=yicS PE=4 SV=1" has the least number of proteins
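One possible way to approach this request, offered as a rough sketch and not as something posted in the thread: capture the header line before removing it, then remember the entry with the lowest distinct-letter count while looping. The sub name `smallest_entry` and the sample data are hypothetical, and ties are resolved in favor of the earliest entry, which is an assumption about the desired behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the header line and the number of distinct
# letters for the record that has the fewest distinct letters.
sub smallest_entry {
    my ($fh) = @_;
    local $/ = '';    # paragraph mode
    my ( $min_header, $min_count );
    while ( my $para = <$fh> ) {

        # Capture the FASTA header line, then remove it and any comments
        my ($header) = $para =~ /^(>.*)/m;
        next unless defined $header;
        $para =~ s/^>.*//m;
        $para =~ s/^\s*#.*//mg;

        my %prot;
        $prot{$_}++ for $para =~ /[A-Z]/g;
        my $distinct = keys %prot;

        # Ties keep the earliest entry
        if ( !defined $min_count or $distinct < $min_count ) {
            ( $min_header, $min_count ) = ( $header, $distinct );
        }
    }
    return ( $min_header, $min_count );
}

# Hypothetical sample input with two short records
my $sample = ">entry1\nACDEFG\n\n>entry2\nAAKK\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my ( $header, $count ) = smallest_entry($fh);
print qq{Entry "$header" has the least number of proteins ($count)\n};
```

In the original script the same bookkeeping could live inside the existing while loop, with the final message printed to `$out_file` after the loop ends.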

      >type davi54.pl
      use strict;
      use warnings;
      my $string = "A=27 C=3 D=16 E=14 "
                 . "F=1 G=11 H=4 I=12 "
                 . "K=10 L=13 M=2 N=6 "
                 . "P=10 Q=6 R=6 S=14 "
                 . "T=20 V=11 W=4 Y=10"
                 ;
      print $string =~ tr/A-Z/A-Z/;

      >perl davi54.pl
      20
      Bill
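The `tr` approach above works because `tr///` returns the number of characters it matched, and each letter appears exactly once in a report line. A brief sketch of two equivalent ways to get that number from a line, using a shortened, hypothetical report line:

```perl
use strict;
use warnings;

# Hypothetical, shortened report line
my $line = "A=27 C=3 D=16 E=14";

# With an empty replacement list, tr leaves the string unchanged
# and returns the number of matching characters.
my $letters = ( $line =~ tr/A-Z// );

# Alternative: count the LETTER=NUMBER fields with a list-context match.
my $fields = () = $line =~ /\b[A-Z]=\d+/g;

print "letters=$letters fields=$fields\n";
```

The field-based count is a little more defensive if the line could ever contain other uppercase text.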
