cosmin has asked for the wisdom of the Perl Monks concerning the following question:
Hello everyone! I have to extract the CDS regions from a chromosome file into a secondary text file. How can I extract all of them? I've read that these regions are treated as tags, and their join and complement parts as secondary tags. I also have to use the ORIGIN section: for each CDS I need to go through the nucleotide string, pick out the part that matches the CDS location, and write it to the secondary text file below that CDS. Can somebody help me with this task? I'm a newbie in Perl.
This is my task: You must design a program that extracts only the CDS sections of such a file, which are described in the FEATURES section, and adds to each one its corresponding nucleotide sequence from the ORIGIN section, thus creating a new .txt file with a much simpler structure. The program must extract from the original file every CDS together with its description and add, by selective extraction from the ORIGIN section, the corresponding nucleotide sequence. The new .txt file should contain, in order, only the CDS descriptions, each followed by its nucleotide sequence.
I have written this code, but I don't know if it is really any good. I think it needs more improvement, but I'm stuck here.
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Take the chromosome file from the command line, or ask for it.
    my $file_name = $ARGV[0];
    unless (defined $file_name) {
        print "Enter your chromosome file: ";
        $file_name = <STDIN>;
        die "No file name given\n" unless defined $file_name;
        chomp $file_name;
    }
    print "Your file is '$file_name'\n";

    # Read the whole file.
    open(my $in, '<', $file_name)
        or die "Sorry, we can't open your $file_name: $!";
    my @content = <$in>;
    close $in;
    chomp @content;
    print "\n\n";

    my @cds;              # one string per CDS feature (key line plus qualifiers)
    my $seq = '';         # nucleotides collected from the ORIGIN section
    my $in_cds    = 0;
    my $in_origin = 0;

    for my $i (0 .. $#content) {
        my $line = $content[$i];
        print "Reg $i: $line\n";

        if ($in_origin) {
            last if $line =~ m{^//};             # end of the record
            # Keep only the nucleotide letters; drop positions and spaces.
            $seq .= join '', $line =~ /[ACGTURYKMSWBDHVN]/ig;
        }
        elsif ($line =~ /^ORIGIN/) {
            print "There is ORIGIN !!\n";
            ($in_cds, $in_origin) = (0, 1);      # the ORIGIN row itself is skipped
        }
        elsif ($line =~ /^ {5}CDS\s/) {
            print "There is CDS\n";
            push @cds, "$line\n";
            $in_cds = 1;
        }
        elsif ($in_cds and $line =~ /^ {21}\S/) {
            $cds[-1] .= "$line\n";               # location continuation or qualifier
        }
        else {
            $in_cds = 0;                         # another feature key starts here
        }
    }

    # This chromosome has tags; the CDS ones should be identified. The lines
    # below come from a BioPerl example and need a Bio::Seq object in $mySeq,
    # which this script never creates, so they stay commented out:
    #   foreach my $feat ($mySeq->all_SeqFeatures()) {
    #       foreach my $tag ($feat->all_tags()) {
    #           print "Feature region has tag ", $tag, " ",
    #               join(' ', $feat->each_tag_value($tag)), "\n";
    #       }
    #   }

    my $length = length $seq;
    print "The sequence has $length nucleotides\n";
    print "There are ", scalar(@content), " records\n";
    print "These are:\n\n";

    # Write each CDS description; fall back to STDOUT if the file can't be opened.
    my $file = "concatenare.txt";
    my $succ = open(my $fh, '>>', $file);
    $fh = *STDOUT unless $succ;
    for my $n (1 .. scalar @cds) {
        print $fh "CDS$n\n", $cds[$n - 1];
        print $fh "Nucleotide sequence\n";   # still missing: the part of $seq for this CDS
    }
    close $fh if $succ;
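The step the task calls "selective extraction from the ORIGIN section" could be sketched roughly as below. This is only a minimal sketch under some assumptions, not a complete GenBank parser: it assumes one record per file, CDS locations built from plain ranges, join(...) and complement(...), and no locations pointing into other entries. The sub names (revcomp, extract_location, parse_record) and the tiny sample record are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse-complement a DNA string, needed for complement(...) locations.
# Only a, c, g, t are swapped; other IUPAC codes are left unchanged here.
sub revcomp {
    my ($dna) = @_;
    my $rc = reverse $dna;
    $rc =~ tr/acgtACGT/tgcaTGCA/;
    return $rc;
}

# Turn a location such as "join(1..3,7..9)" or "complement(10..15)" into the
# matching nucleotides. Positions are 1-based and inclusive.
sub extract_location {
    my ($loc, $seq) = @_;
    $loc =~ s/[<>\s]//g;                     # drop partial markers and line wraps
    if ($loc =~ /^complement\((.*)\)$/) {
        my $inner = $1;
        return revcomp(extract_location($inner, $seq));
    }
    if ($loc =~ /^(?:join|order)\((.*)\)$/) {
        my $inner = $1;
        return join '', map { extract_location($_, $seq) } split /,/, $inner;
    }
    if ($loc =~ /^(\d+)\.\.(\d+)$/) {
        return substr($seq, $1 - 1, $2 - $1 + 1);
    }
    if ($loc =~ /^(\d+)$/) {
        return substr($seq, $1 - 1, 1);
    }
    warn "Location '$loc' is not handled by this sketch\n";
    return '';
}

# Split a record into its CDS features and the full ORIGIN sequence.
sub parse_record {
    my ($text) = @_;
    my (@cds, $in_cds, $in_origin);
    my $seq = '';
    for my $line (split /\n/, $text) {
        if ($in_origin) {
            last if $line =~ m{^//};         # end of the record
            $seq .= join '', $line =~ /[a-z]/ig;
        }
        elsif ($line =~ /^ORIGIN/) {
            ($in_cds, $in_origin) = (0, 1);
        }
        elsif ($line =~ /^ {5}CDS\s+(\S.*)/) {
            push @cds, { location => $1, text => "$line\n", loc_done => 0 };
            $in_cds = 1;
        }
        elsif ($in_cds and $line =~ /^ {6,}(\S.*)/) {
            # Qualifier and continuation lines are indented further than
            # feature keys. Lines before the first /qualifier still belong
            # to a wrapped location.
            my $rest = $1;
            $cds[-1]{loc_done} = 1 if $rest =~ m{^/};
            $cds[-1]{location} .= $rest unless $cds[-1]{loc_done};
            $cds[-1]{text} .= "$line\n";
        }
        else {
            $in_cds = 0;                     # some other feature or section
        }
    }
    return (\@cds, $seq);
}

# Made-up toy record, only to show the idea (not real data).
my $sample = <<'END';
LOCUS       TOY                       24 bp    DNA
FEATURES             Location/Qualifiers
     source          1..24
     gene            1..9
     CDS             join(1..3,7..9)
                     /product="first"
     CDS             complement(10..15)
                     /product="second"
ORIGIN
        1 atgcccgggt aaaaattttt ccgg
//
END

my ($cds, $seq) = parse_record($sample);
my $n = 0;
for my $feature (@$cds) {
    $n++;
    print "CDS$n\n", $feature->{text};
    print extract_location($feature->{location}, $seq), "\n\n";
}
```

To use this on a real chromosome file, the file contents would be read into a string and passed to parse_record in place of $sample, and the output printed to the new .txt file instead of STDOUT. If BioPerl is installed, reading the file with Bio::SeqIO and asking each CDS feature for its spliced sequence is probably the more robust route, since it already understands the location syntax.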