Re: renaming 1000's of FASTA files

in reply to renaming 1000's of FASTA files

Is there anything sensible I can do to speed this up?

Stop reading the same file over and over again. Maybe something like this can help you (untested):

#!/usr/bin/perl

use strict;
use warnings;
use Bio::SeqIO;
use Data::Dumper;
my %seq_id;
open HEADER , "<FASTA.headers" or die $!;
while (<HEADER>){
    chomp $_;
    my $fasta_id = $_;
    $fasta_id =~ s/_.*//g ;
    $seq_id{$fasta_id} => $_;
}

my $infile = $ARGV[0] || die ("Please give me an input fasta file\n");
my $inseq = new Bio::SeqIO(-format => 'fasta',
    -file => $infile);
while (my $seq_obj = $inseq->next_seq ) {
    my $id = $seq_obj->id ;
    chomp $id;
    my $seq = $seq_obj->seq ;
    if (exists ($seq_id{$id})) {
        print ">";
        print $seq_id{$fasta_id};
        print "\n".$seq."\n";
    }
}

Perl 6 - second systems done right

Re^2: renaming 1000's of FASTA files by Cristoforo (Curate) on Jul 11, 2011 at 17:03 UTC
Shouldn't this code:

`if (exists ($seq_id{$id})) { print ">"; print $seq_id{$fasta_id}; print "\n".$seq."\n"; }`

be

`if (exists ($seq_id{$id})) { print ">"; print $seq_id{$id}; print "\n".$seq."\n"; }`

and `$seq_id{$fasta_id} => $_;` be `$seq_id{$fasta_id} = $_;`?

Update: Corrected `print $seq_id{$id};` from `print $id`.
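Both corrections matter. `=>` is only a fancy comma, so the original line never stores anything in `%seq_id`, and `$fasta_id` is declared with `my` inside the first loop, so it is out of scope in the second and the script will not compile under `use strict`. Below is a minimal sketch of the same idea with both fixes applied: read the header list once into a hash, then make a single pass over the FASTA file. To keep it runnable with core Perl only, it parses FASTA by hand rather than through Bio::SeqIO. The sub names `load_headers` and `rename_records` and the sample data are made up for illustration, and it assumes the lines in the header list carry no leading `>`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map each short ID (text before the first underscore) to its full
# header line. The header list is read exactly once.
sub load_headers {
    my ($fh) = @_;
    my %full_header_for;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;
        ( my $short_id = $line ) =~ s/_.*//;
        $full_header_for{$short_id} = $line;    # plain assignment, not =>
    }
    return \%full_header_for;
}

# Stream FASTA records from $in in a single pass. Records whose ID is in
# the lookup are written to $out under the full header; others are skipped.
sub rename_records {
    my ( $in, $out, $full_header_for ) = @_;
    my $keep = 0;
    while ( my $line = <$in> ) {
        chomp $line;
        if ( $line =~ /^>(\S+)/ ) {
            my $id = $1;
            $keep = exists $full_header_for->{$id};
            print {$out} ">$full_header_for->{$id}\n" if $keep;
        }
        elsif ($keep) {
            print {$out} "$line\n";    # sequence lines keep their wrapping
        }
    }
    return;
}

# Demo on in-memory data (hypothetical IDs and sequences).
my $headers = "seq1_Homo_sapiens\nseq2_Mus_musculus\n";
my $fasta   = ">seq1\nACGT\nTTGA\n>seq9\nGGGG\n>seq2\nCCCC\n";
my $result  = '';

open my $hdr_fh, '<', \$headers or die "Cannot open headers: $!";
open my $in_fh,  '<', \$fasta   or die "Cannot open FASTA data: $!";
open my $out_fh, '>', \$result  or die "Cannot open output buffer: $!";

rename_records( $in_fh, $out_fh, load_headers($hdr_fh) );
close $out_fh;

print $result;
```

For real files, the three in-memory `open` calls would be replaced by opening `FASTA.headers`, the input file named in `$ARGV[0]`, and `\*STDOUT` as the output handle. Unlike the Bio::SeqIO version, this sketch leaves sequence line wrapping as it was in the input instead of joining each sequence onto one line.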
