http://qs321.pair.com?node_id=1213297


in reply to REGEX help

Hello. Welcome.

I think you want to look at BioPerl.

This code is completely untested. I didn't want to install BioPerl on my system. But it might work if you can get BioPerl installed. (I know you're new to Perl, so installing BioPerl might be a stretch, but this might help: http://bioperl.org/INSTALL.html.)

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Bio::Seq;
    use Data::Dumper::Simple;
    use feature "say";

    # Convert the sequence to lower case. Upper case might be ok,
    # but the docs for Bio::Seq used lower case, so let's go with that.
    # qw() splits on whitespace, so join '' yields one unbroken string.
    my $letters = lc join '', qw(
        TTCAGGTGTTTGCAACTGCGTTTTATTGCAAGAAAGAGTGGAGGGGTTTCCA
        TGGGGCCCACCTCACAACCCACTC
        TTCACCCCCAAAATCACGCAGGGATCGGACTCAGGAAAGGGAAG
        CATCTGTGTGTTGCATACGAGCCCTTCCTGTACTTACTTCTTTCACAGCAGGGAAGG
        AAGAGGGAAGA
        GGCAGCTGTGGAGAGGATCAGGTTGCGGGAGGTGGGTATCTCGCTGCTCTGACCTTACGTACAGTCCTC
        CACAGAAGCATCAAAGTGGACT
        GGCACATATCGGCTCCCTTCACAGGCCACAATCATCTGTCTCTCCT
        TCGGGCTGGTCCGGTATCCAC
    );

    # Create a sequence object.
    my $seq_object = Bio::Seq->new( -seq => $letters, -alphabet => 'dna' );

    # Look for the ORF. I specified the start, but I didn't see how to
    # specify the stop. Are the stop codons universal? I'm way out of
    # my league here.
    my $prot_object = $seq_object->translate( -orf => 1, -start => "atg" );

    say Dumper $prot_object;
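Since the original question was about a regex, here is a rough core-Perl sketch of the same idea that does not need BioPerl. It assumes the standard genetic code, where the stop codons are TAA, TAG and TGA. They are not universal (vertebrate mitochondria, for one, read TGA as an amino acid), so adjust the list if your organism differs. The name first_orf and the short test sequence are made up for illustration, and it only looks at the forward strand.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first open reading frame in $dna,
# meaning an ATG, then whole codons, up to and including the first
# in-frame stop codon (TAA, TAG or TGA, standard genetic code assumed).
# Returns undef when no complete ORF is found on the forward strand.
sub first_orf {
    my ($dna) = @_;
    $dna = uc $dna;
    $dna =~ s/[^ACGT]//g;    # drop whitespace and any other stray characters

    # The non-greedy codon group makes the match stop at the first
    # in-frame stop codon rather than the last one.
    return $1 if $dna =~ /(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA))/;
    return undef;
}

# Made-up example sequence, not taken from the original question.
my $orf = first_orf("ccATGaaaTAGgg");
print defined $orf ? "$orf\n" : "no ORF found\n";
```

The regex engine tries the leftmost ATG first and only moves on to a later ATG if no in-frame stop follows, which is roughly what "first ORF" means here. Translating the result to protein is still easier with BioPerl.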

Cheers,

Brent

-- Yeah, I'm a Delt.