PerlMonks  

Re: unique sequences

by Laurent_R (Canon)
on Dec 11, 2017 at 07:19 UTC


in reply to unique sequences

Hi Anonymous Monk,

it is difficult to guess what you should be obtaining without seeing the input, but the output you get is in line with the code you've shown. Your code basically discards the "comments" (that's what I call the lines starting with >, for lack of a better description) and then looks for sequences of ten nucleotides (I hope this is the right term) followed by GG. And that's pretty much what you have in your output. So, to me, you get what you ask for.
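
To make that reading concrete, here is a minimal sketch of the behaviour described above. It is not the original script (which I am not reproducing here): the sub name find_candidates, the restriction to the letters A, C, G and T, and the choice to report overlapping matches are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns every run of
# ten nucleotides that is immediately followed by "GG".
sub find_candidates {
    my ($text) = @_;
    my $seq = '';
    for my $line (split /\n/, $text) {
        next if $line =~ /^>/;    # discard the ">" header ("comment") lines
        $line =~ s/\s+//g;        # drop stray whitespace, including "\r"
        $seq .= uc $line;         # join all remaining lines into one string
    }
    my @found;
    # Assumption: overlapping hits are wanted. The lookahead (?=...) matches
    # without consuming characters, so every start position is tried.
    while ($seq =~ /(?=([ACGT]{10})GG)/g) {
        push @found, $1;
    }
    return @found;
}

my @hits = find_candidates(">seq1 demo\nAAAACCCCGTGGTT\n");
print "$_\n" for @hits;
```

If overlapping hits are not wanted, a plain /([ACGT]{10})GG/g without the lookahead would skip past each match before searching again.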

Please explain in plain English what you need to extract and in what respect the output you get is not what you want or need.

As a side note, it may or may not be relevant or important, but please remember that a hash does not preserve the order in which the data were populated into it.
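
If the order does matter, one common idiom is to use the hash only to remember what has already been seen, and to keep the values themselves in an array. This is a sketch under that assumption; the sub name unique_in_order and the sample data are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: remove duplicates while keeping first-seen order.
# %seen only records which values have appeared; the order comes from
# the input list, not from the hash.
sub unique_in_order {
    my @items = @_;
    my %seen;
    return grep { !$seen{$_}++ } @items;
}

my @hits = qw(TTTT AAAA TTTT CCCC AAAA);
print join(' ', unique_in_order(@hits)), "\n";    # prints "TTTT AAAA CCCC"
```

Printing keys %seen instead would give the same three values, but in an order you cannot rely on.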

Replies are listed 'Best First'.
Re^2: unique sequences
by BillKSmith (Monsignor) on Dec 11, 2017 at 20:39 UTC
    The name of the file-handle suggests that the input is in FASTA format. The reference article indicates that a file may contain more than one sequence. Each sequence is prefixed with a '>' line which specifies its name and may contain comments. If the input contains more than one sequence, the script combines them. If the file is known to contain only one sequence, it is overkill to test every line for '>'.
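
If the file may hold several sequences and they should be searched separately, the '>' lines can be used to start a new record instead of being thrown away. The following is only a sketch of that idea, not the script under discussion; the sub name read_fasta_records and the header/seq field names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split FASTA-style text into one record per '>' line,
# so that sequences from different records are never joined together.
sub read_fasta_records {
    my ($text) = @_;
    my @records;
    for my $line (split /\n/, $text) {
        $line =~ s/\s+$//;           # trim trailing whitespace, including "\r"
        next unless length $line;    # skip blank lines
        if ($line =~ /^>(.*)/) {
            # A header line starts a new record.
            push @records, { header => $1, seq => '' };
        }
        elsif (@records) {
            # Sequence lines are appended to the most recent record;
            # lines before the first header are ignored.
            $records[-1]{seq} .= $line;
        }
    }
    return @records;
}

for my $rec (read_fasta_records(">a first\nACGT\nAC\n>b\nGG\n")) {
    print "$rec->{header}: $rec->{seq}\n";
}
```

Each record's seq can then be searched on its own, which avoids reporting a match that spans the end of one sequence and the start of the next.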
    Bill
