in reply to REGEX help
Hello. Welcome.
I think you want to look at BioPerl.
This code is completely untested. I didn't want to install BioPerl on my system. But it might work if you can get BioPerl installed. (I know you're new to Perl, so installing BioPerl might be a stretch, but this might help: http://bioperl.org/INSTALL.html.)
#!/bin/perl
use strict;
use warnings;
use Bio::Seq;
use Data::Dumper::Simple;
use feature "say";

# Convert the sequence to lower case. Upper case might be ok,
# but the docs for Bio::Seq used lower case, so let's go with that.
# The pieces are joined so the sequence contains no whitespace.
my $letters = lc( join '', qw(
    TTCAGGTGTTTGCAACTGCGTTTTATTGCAAGAAAGAGTGGAGGGGTTTCCA
    TGGGGCCCACCTCACAACCCACTC
    TTCACCCCCAAAATCACGCAGGGATCGGACTCAGGAAAGGGAAG
    CATCTGTGTGTTGCATACGAGCCCTTCCTGTACTTACTTCTTTCACAGCAGGGAAGG
    AAGAGGGAAGA
    GGCAGCTGTGGAGAGGATCAGGTTGCGGGAGGTGGGTATCTCGCTGCTCTGACCTTACGTACAGTCCTC
    CACAGAAGCATCAAAGTGGACT
    GGCACATATCGGCTCCCTTCACAGGCCACAATCATCTGTCTCTCCT
    TCGGGCTGGTCCGGTATCCAC
) );

# Create a sequence object.
my $seq_object = Bio::Seq->new(
    -seq      => $letters,
    -alphabet => 'dna',
);

# Look for the ORF. I specified the start, but I didn't see how to
# specify the stop. Are the stop codons universal? I'm way out of
# my league here.
my $prot_object = $seq_object->translate(
    -orf   => 1,
    -start => "atg",
);

say Dumper($prot_object);
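If installing BioPerl turns out to be too much, the ORF search on its own can be sketched in plain Perl with a regex. This is only a rough illustration, not how BioPerl does it: `first_orf` is a made-up helper, it assumes the stop codons of the standard genetic code (TAA, TAG, TGA), it only looks at the forward strand, and it returns the DNA of the reading frame without translating it to protein.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: return the first open reading
# frame in $dna, from an ATG through the first in-frame stop codon.
# Assumes the standard genetic code (stop codons TAA, TAG, TGA).
sub first_orf {
    my ($dna) = @_;
    $dna = uc $dna;
    $dna =~ s/\s+//g;    # drop whitespace and line breaks

    # ATG, then as few whole codons as possible, then a stop codon.
    # The non-greedy quantifier makes the match end at the first
    # in-frame stop after the start codon.
    return $dna =~ /(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA))/ ? $1 : undef;
}

my $orf = first_orf("ccATGaaaTTTtagGGG");
print defined $orf ? "$orf\n" : "no ORF found\n";
```

If the first ATG has no in-frame stop after it, the regex engine moves on to the next ATG, and the function returns undef when nothing matches at all.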
Cheers,
Brent
-- Yeah, I'm a Delt.