PerlMonks  

Re: Parsing and Modifying a flat file in perl

by kennethk (Abbot)
on Jun 23, 2010 at 17:25 UTC ( [id://846120]=note )


in reply to Parsing and Modifying a flat file in perl

What have you tried? Why didn't it work? Please read "How do I post a question effectively?". In particular, input and expected output should be wrapped in code tags to maintain formatting. In addition, the mapping from your input to your output is not entirely obvious to me, so you should explain it. Effort is appreciated around here.

The following code does something like what you need. Read it, consider it, and understand it. Post specific questions following site guidelines if anything is unclear.

#!/usr/bin/perl
use strict;
use warnings;

my $buffer = "";
my $series = 1;

$_ = <DATA>;    # Skip first line (the first header)
while (<DATA>) {
    if (/>/) {
        # New header: flush the sequence accumulated for the previous record
        my @elements = split /N+/, $buffer;
        for my $i (1 .. @elements) {
            print ">Count$series.$i\n$elements[$i-1]\n";
        }
        $buffer = "";
        $series++;
    }
    else {
        # Sequence line: append it to the current record
        chomp;
        $buffer .= $_;
    }
}

# Flush the last record, which has no header after it
my @elements = split /N+/, $buffer;
for my $i (1 .. @elements) {
    print ">Count$series.$i\n$elements[$i-1]\n";
}

__DATA__
>sequence1 123.3
ATGACGTAGACGATGAGTAGACGATAGCAGTGACAGGTGAGTG
ATGACGATGAGTAGAGACGGGGGTAGAGGGGGATAGATAGAGANNNNNNNN
ATAGACAGATAANNNNNNNNNNNNNNNNNAGATGAGACAGATANNNNNNN
>sequence2 143.5
ATGCGATGCNNNNNNCGTAGCTGANNNNNNCGATGCTGATGCTC
CGTAGTCTGCTAGCTAGTCNNNNNNCGTAGTCGATCGATCGANNNNNNCGTGCATGC
CGATGCTACGGATNNNNNCGATCGATCGATCGACNNNNNCGATCAGCTAG
CCCCGCTAGTCANNNNN
>sequence3 132.3
ATGCTGATCAGCTACGCTAGCNNNNNCGATCGATCGATCGACTAGCNNNNNNCGATCCGAGCT
CGATCGATCGATCGATCGANNNNNCGATCGATCGACTAGCNNNNNCGATCGATCGA
CGATCGATCGA
>C1132423 123.4
ATCGTGCATGCATCGATCGACTACGCTGCTACGATCGACTGCTAGCTACGCTAC
CGTCGATCGATCGACTACGCTGACTGACTAGCTAG
>C1123234 176.4
GCTAGCGATCGCACCGATCGATCGTACGCTACGATCGATCGATCGATCGACTGT
CGATCGATCGATCGATCGATCGA
>C1123546 531.1
CGTAGCTACGATCGATCGATCGACTAGCTACGATCGATCGACTAGCTAGCTAGCTAG
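One detail of split is worth knowing when adapting the code above: if a sequence starts with a run of N, split /N+/ returns an empty leading field, which would be printed as an empty record. Trailing empty fields are dropped by default. The sketch below shows one way to filter the empty pieces out; the helper name split_on_n_runs and the sample string are made up for illustration and are not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): split a sequence on
# runs of N and discard any empty pieces.
sub split_on_n_runs {
    my ($sequence) = @_;
    # split keeps an empty leading field when the string starts with N;
    # grep { length } removes it.
    return grep { length } split /N+/, $sequence;
}

# Made-up input with leading and trailing runs of N.
my @pieces = split_on_n_runs("NNNACGTNNNNTTGANN");
print scalar(@pieces), " pieces: @pieces\n";    # 2 pieces: ACGT TTGA
```

Whether empty records should be dropped or kept depends on what the downstream numbering is supposed to mean, so treat the filter as optional.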

Re^2: Parsing and Modifying a flat file in perl
by ad23 (Acolyte) on Jun 23, 2010 at 20:18 UTC

    I will keep in mind the things you mentioned above (sorry about the formatting; this was my first post).

    I was trying something like this (just a snippet of my code):

    $scafSeq = $ARGV[0];
    open (IN, "< $scafSeq");
    while ( $line = <IN> ) {
        chomp $line;
        $line =~ s/^\s+//g;
        $line =~ s/sequence/count/g;
        next if $line eq "";
        if(substr($line,0,1) eq ">") {
            ($scaff) = $line =~ /^>(\S+)/;
        }
        else {
            $scaffData -> {$scaff} .= $line;
        }
    }

    I then sort the keys and split each value on N. This approach worked for the >sequence1, etc. records, but it did not work for the >C1113456... records.
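A guess at the cause, based only on the snippet above: s/sequence/count/g renames only headers that contain "sequence", so the >C... headers keep their original names, and sorting hash keys no longer reflects input order. The sketch below is one possible variant that keeps records in the order they were read and numbers them by position instead. The helper read_records and the sample lines are hypothetical, and the rest of the original script is not shown, so this may not match what it needs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect FASTA-like lines into [header, sequence]
# pairs, preserving input order (an array instead of a sorted hash).
sub read_records {
    my (@lines) = @_;
    my @records;
    for my $line (@lines) {
        chomp $line;
        $line =~ s/^\s+//;
        next if $line eq "";
        if ($line =~ /^>(\S+)/) {
            push @records, [ $1, "" ];      # start a new record
        }
        elsif (@records) {
            $records[-1][1] .= $line;       # append to the current record
        }
    }
    return @records;
}

# Made-up sample input; real data would be read from a file handle.
my @records = read_records(
    ">sequence1 123.3\n", "ATGNNNNCGT\n", "AANNTT\n",
    ">C1132423 123.4\n",  "ATCGTG\n",
);

# Number records by position, so the header text does not matter.
my $series = 0;
for my $record (@records) {
    $series++;
    my @pieces = grep { length } split /N+/, $record->[1];
    for my $i (1 .. @pieces) {
        print ">Count$series.$i\n$pieces[$i-1]\n";
    }
}
```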

    Your code is short and effective for this job. Thanks!
