PerlMonks  

Re: Compare fasta files

by GotToBTru (Prior)
on Nov 20, 2016 at 03:35 UTC ( [id://1176183]=note )


in reply to Compare fasta files

It appears that your code compares the files by comparing the first two elements of each line. That doesn't make sense given what I have seen of the FASTA format. Could you provide a short sample of two of the files you will be working with?

Also, you can't compare hashes with simple scalar comparison operators. Modules like Data::Compare can help. See also compare hash.
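To illustrate the point about hashes, here is a minimal sketch of a hand-rolled comparison using only core Perl. The helper name hashes_equal and the sample data are made up for illustration, and it assumes flat hashes whose values are defined strings; for nested structures a module such as Data::Compare is the better fit.

```perl
use strict;
use warnings;

# Hypothetical helper: true when two flat hashes (defined string values
# only) have the same keys and the same value for every key.
sub hashes_equal {
    my ($left, $right) = @_;
    # Different key counts mean the hashes cannot match.
    return 0 unless keys %$left == keys %$right;
    for my $key (keys %$left) {
        return 0 unless exists $right->{$key};
        return 0 unless $left->{$key} eq $right->{$key};
    }
    return 1;
}

# Made-up sample data.
my %first  = (seq1 => 'ACGT', seq2 => 'GGCC');
my %second = (seq2 => 'GGCC', seq1 => 'ACGT');
my %third  = (seq1 => 'ACGT', seq2 => 'GGCA');

print hashes_equal(\%first, \%second) ? "same\n" : "different\n";
print hashes_equal(\%first, \%third)  ? "same\n" : "different\n";
```

The function takes hash references, so the comparison walks the keys rather than comparing the hashes in scalar context.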

But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)

Re^2: Compare fasta files
by cboPerl (Initiate) on Nov 20, 2016 at 14:17 UTC
    I am posting the beginning of one of the .fasta files:
    >167440 TCONS_00167441 scaffold_2269+ 284-1043
    AGGGCTCAAGCTTTATTTCACGTAGCTGACTTTACCGTCAGCTCAATTGGAATAGTTTTT
    CGCTATGTTCGCAGGCAAGTGAGACGATCCATCAATGCCCTTATCTGCTTCGAAAGAACC
    GGTGTCATCCAAACATGGTGAAGAGGTGGCAACTGGATCAATAATAGCTGAAACTTCTAC
    TGTACAGGGTTCGGCTTGCCCAACTGTCCAAGCTTGAGATCTATTTTAGAATATGCTTAA
    CACAACACATGCAATTCGAACGTTGTTTTCTCGGAAAGATTTGAAAGTAACTCCGTTGGG
    TTCAATGCCCGCTAGTCCCATGCATCCTTTCTGTTGGTCAACAACCAACCACAAGTCAAT
    CGAATGAATTCTTCAAGACTCCGGACTCTCTTTCTGTCCGGAGGGAATCATTGTTTCTCA
    ATCAATCATGCCTCAACTGGATAAATTCACTTATTTCACACAATTTTTCTGGTCATGCCT
    TTTCCTCTTTACTTTTTATATTCCCATATGCAATGATGGAGATGGAGTACTTGGGATCAG

      Your code does not account for the header line, nor would it work on the sequence data since it contains no white space. The replies to How to get non-redundant DNA sequences from a FASTA file? might provide some good insight into how to work with your files. There are packages Bio::Perl and Bio::SeqIO that you might find useful.
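As a rough sketch of what handling the header line could look like, the following reads FASTA text into a hash keyed by record ID, using only core Perl. The helper name read_fasta and the two-record input are made up, and it assumes the ID is the first word after '>' (which would be 167440 in the sample above). Bio::SeqIO handles the format more thoroughly.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA text from a filehandle and return a
# hash reference mapping each record's ID (first word after '>') to its
# sequence, with the line breaks inside the sequence removed.
sub read_fasta {
    my ($fh) = @_;
    my (%sequence_for, $current_id);
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;            # strip newline and trailing blanks
        next if $line eq '';
        if ($line =~ /^>(\S+)/) {      # header line starts a new record
            $current_id = $1;
            $sequence_for{$current_id} = '';
        }
        elsif (defined $current_id) {  # sequence line: append to current record
            $sequence_for{$current_id} .= $line;
        }
    }
    return \%sequence_for;
}

# Made-up two-record input, read from an in-memory filehandle.
my $sample = ">id1 some description\nACGT\nTTGA\n>id2 other\nGGCC\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $records = read_fasta($fh);
close $fh;

print "$_ => $records->{$_}\n" for sort keys %$records;
```

Once each file is loaded this way, two files can be compared record by record on their IDs and sequences instead of on whitespace-separated fields.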

      In general, I strongly suggest you use Super Search and search for FASTA. I'd suggest restricting the search to root nodes (there are radio buttons to exclude replies). See what your colleagues have been asking, because questions about FASTA files come up pretty regularly here.

