Hello everyone,
I am trying to write a script in perl which will do the following
it will read a pdb file that contains only Ca atoms as the following
1 2 3 4 5 6
ATOM 1 CA PRO A 889 84.370 72.820 26.830 1.00 0.00
+
ATOM 2 CA THR A 890 87.370 73.900 28.080 1.00 0.00
+
ATOM 3 CA VAL A 891 90.920 72.490 27.750 1.00 0.00
+
ATOM 4 CA PHE A 892 93.640 74.890 28.970 1.00 0.00
+
ATOM 5 CA HIS B 893 97.060 74.200 27.360 1.00 0.00
+
ATOM 6 CA LYS B 894 99.880 73.920 29.990 1.00 0.00
it will read a second pdb that contains every atom
1 2 3 4 5 6
ATOM 1 N PRO A 889 16.220 12.185 1.804 1.00 71.54
+ N
ATOM 2 CA PRO A 889 16.101 12.990 3.034 1.00 70.89
+ C
ATOM 3 C PRO A 889 15.432 14.346 2.803 1.00 72.31
+ C
ATOM 4 O PRO A 889 14.743 14.852 3.703 1.00 72.20
+ O
ATOM 5 CB PRO A 889 17.553 13.151 3.502 1.00 72.96
+ C
ATOM 6 CG PRO A 889 18.315 12.067 2.782 1.00 78.00
+ C
ATOM 7 CD PRO A 889 17.626 11.907 1.465 1.00 73.35
+ C
(The files refer to the same molecule but have different number of lines)
So if the residue number (column num 5) is the same it will take the chain letter (column num 4) from the first file and replace all the chain letters that have the same residue number in the second file. So far i've got this disaster :/
print "\nEnter the network pdb file file: ";
$inputFile = <STDIN>;
chomp $inputFile;
unless (open(INPUTFILE, $inputFile)) {
print "Cannot read from '$inputFile'";
<STDIN>;
exit;
}
# load the file into an array
chomp(@networkpdb = <INPUTFILE>);
# close the file
close(INPUTFILE);
print "\nEnter the pdb output file: ";
$inputFile2 = <STDIN>;
chomp $inputFile2;
unless (open(INPUTFILE, $inputFile2)) {
print "Cannot read from '$inputFile2'";
<STDIN>;
exit;
}
chomp(@pdb = <INPUTFILE>);
close(INPUTFILE);
for ($line1 = 0; $line1 < scalar @networkpdb; $line1++) {
if ($networkpdb[$line1] =~ m/ATOM\s+\d+\s+\w+\s+\w{3}\s*(\w+)\s*(\
+d*)\s+\S+\.\S+\s+\S+\.\S+\s+\S+\.\S+\s+.+\..+\..*/ig) {
my $resnum=$2;
my $chain=$1;
for ($line = 0; $line < scalar @pdb; $line++) {
if ($pdb[$line]=~ m/(ATOM\s+\d+\s+\w+\s+\w{3}\s*)(\w+)\s*(\d*)(\s
++\S+\.\S+\s+\S+\.\S+\s+\S+\.\S+\s+.+\..+\..*)/ig) {
my $begining=$1;
my $resnum1=$3;
my $chain1=$2;
my $end=$4;
if ($resnum1=$resnum)
{$chain1=$chain;
$parsedData{$line} = $begining.$chain1."\s".$resnum1.$end;
}}}}}
# create the output file name
$outputFile = "WithNetwork_".$inputFile;
# open the output file
open (OUTFILE, ">$outputFile");
# print the data lines
foreach $line (sort {$a <=> $b} keys %parsedData) {
print OUTFILE $parsedData{$line}."\n";
}
# close the output file
close (OUTFILE);
thank you very much in advance