1. RAM problem.
RAM is 512MB and swap 1GB.
A typical file is 1.2MB.
No. of lines (atoms): 20,000.
Note: This is in test phase, but it can increase by about 10-50 fold.
2. Graph Theory problem.
This unfortunately is not a Graph-Theory problem, but more of a gas-phase problem. I need to find out all the 'interacting-pairs' (ie pairs of atoms close enough) to do a more complicated analysis.
3. FORTRAN object file.
OK, maybe I can make this piece into a library file, so that I can use it like a POSIX function?
4. Grid-wise calculation.
Promising. :) I thought, but was a bit lazy to try. After brute-force I began to wonder if it would be worth the effort... Thanks. :)
5. Chemistry::Bond::Find
Yes, I must try this. :) Mine is a protein with a hell of a lot of water, and I have to find water-protein hydrogen bonds, at least based on distance alone. Will get back to you, itub. :D
6. Using square of distance.
Well, yes, I was already using the square of the distance, and yes, it is on PDB files. :)
7. Using x:y:z boxinfo.
Yes. I thought of it, but was lazy as I thought I would need to put in a lot of code. But now I am convinced it won't be so much. Thanks a lot BrowserUk. :)
8. The code:
# Open each PDB
foreach my $pdb_file (<$pdb_list>) {
    chomp($pdb_file);
    # We open the PDB with this handle
    open(my $tmp_file, '<', $pdb_file) or die "Cannot open PDB $pdb_file: $!";

    # Read X,Y,Z coordinates, one line at a time
    my @X; # X-coordinates
    my @Y; # Y-coordinates
    my @Z; # Z-coordinates
    while (my $line = <$tmp_file>) {
        if (substr($line,0,3) eq "ATO") {
            push @X, substr($line,30,8);
            push @Y, substr($line,38,8);
            push @Z, substr($line,46,8);
        }
    }
    close($tmp_file);

    # Find the interaction pairs. Also store the best angle, should it
    # occur again with another proton. Also keep track of the waters
    # that have made h-bond with the solute atoms.
    my @sel_wat; # Array of water molecule (numbers) selected
    my %hbond;   # $hbond{"$tag1:$tag2"}[0]=distance
                 # $hbond{"$tag1:$tag2"}[1]=angle

    # Solute as donor and water as acceptor.
    # @pol_h, @wat_a, @idty and $hb_dist are set up elsewhere (not shown here).
    # $hb_dist is compared with a squared distance, so it has to be the
    # squared cutoff.
    {
        my @atom_cov; # Polar solute atom that is already covered.
        for (my $i=0; $i <= $#pol_h; $i++) {
            # If this donor is not already covered, then go ahead.
            if ( ! defined($atom_cov[$pol_h[$i][1]]) ) {
                for (my $j=0; $j <= $#wat_a; $j++) {
                    my $dx=$X[$pol_h[$i][1]]-$X[$wat_a[$j]];
                    my $dy=$Y[$pol_h[$i][1]]-$Y[$wat_a[$j]];
                    my $dz=$Z[$pol_h[$i][1]]-$Z[$wat_a[$j]];
                    my $distSq=($dx*$dx)+($dy*$dy)+($dz*$dz);
                    if ($distSq <= $hb_dist) {
                        $atom_cov[$pol_h[$i][1]]=1;
                        print $idty[$pol_h[$i][1]], " ",
                              $idty[$wat_a[$j]], " ", $distSq, "\n";
                    }
                }
            }
        }
    }
}
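For points 4 and 7, here is a minimal sketch of how the grid (box) idea could look. It is not the code from my script: close_pairs, cell_of and the argument layout are made up for illustration, and the cutoff here is the plain distance, not its square. Atoms of the second set are binned into cubic cells whose edge equals the cutoff, so each atom of the first set only needs to be checked against the 27 cells around it.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: integer cell indices of a point for a given cell edge.
sub cell_of {
    my ($x, $y, $z, $edge) = @_;
    return map { floor($_ / $edge) } $x, $y, $z;
}

# Hypothetical helper: returns a reference to a list of [i, j, distSq] for
# every i in @$set_a and j in @$set_b that lie within $cutoff of each other.
# $X, $Y, $Z are array references holding the coordinates.
sub close_pairs {
    my ($X, $Y, $Z, $set_a, $set_b, $cutoff) = @_;
    my $cutoff_sq = $cutoff * $cutoff;

    # Bin the atoms of set B into cubic cells with edge length $cutoff.
    my %cell;
    for my $j (@$set_b) {
        my $key = sprintf '%d:%d:%d',
            cell_of($X->[$j], $Y->[$j], $Z->[$j], $cutoff);
        push @{ $cell{$key} }, $j;
    }

    my @pairs;
    for my $i (@$set_a) {
        my ($cx, $cy, $cz) = cell_of($X->[$i], $Y->[$i], $Z->[$i], $cutoff);
        # A partner within the cutoff can only sit in one of the 27 cells
        # surrounding (and including) the cell of atom $i.
        for my $ox (-1 .. 1) {
            for my $oy (-1 .. 1) {
                for my $oz (-1 .. 1) {
                    my $key = sprintf '%d:%d:%d',
                        $cx + $ox, $cy + $oy, $cz + $oz;
                    my $bucket = $cell{$key} or next;
                    for my $j (@$bucket) {
                        my $dx = $X->[$i] - $X->[$j];
                        my $dy = $Y->[$i] - $Y->[$j];
                        my $dz = $Z->[$i] - $Z->[$j];
                        my $dist_sq = $dx*$dx + $dy*$dy + $dz*$dz;
                        push @pairs, [ $i, $j, $dist_sq ]
                            if $dist_sq <= $cutoff_sq;
                    }
                }
            }
        }
    }
    return \@pairs;
}

# Tiny made-up example: atom 0 against atoms 1..3 with a 3.5 cutoff.
my $pairs = close_pairs([0, 1, 5, -3], [0, 0, 5, 0], [0, 0, 5, 0],
                        [0], [1, 2, 3], 3.5);
printf "%d %d %.2f\n", @$_ for @$pairs;
```

The work then grows roughly with the number of atoms times the number of neighbours per cell, instead of with the product of the two set sizes, which is what should matter when going from 20,000 atoms towards 100K.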
Regarding the issue of finding H-bonds in PDB files:
Besides all the algorithmic suggestions already given, the problem may be even "smaller" than it looks. If your PDB file has the water molecules and the protein labeled properly, you only have to consider water atom-protein atom pairs instead of every possible pair. And not even every protein atom, if you restrict your definition of an H-bond to the typical O...H or N...H.
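A rough sketch of that pre-filtering, assuming standard fixed-column PDB records (atom name in columns 13-16, residue name in columns 18-20). The sub name classify_atoms, the list of water residue names and the "first letter of the atom name" test are assumptions for illustration, so adjust them to how your files are actually labeled. Note that it counts both ATOM and HETATM records, whereas the code above reads only ATOM records.

```perl
use strict;
use warnings;

# Assumption: waters carry one of these residue names.
my %is_water = map { $_ => 1 } qw(HOH WAT TIP);

# Hypothetical helper: takes PDB lines and returns two array references,
# the indices of water oxygens and the indices of protein N/O atoms.
# Indices count ATOM and HETATM records in file order, starting at 0.
sub classify_atoms {
    my (@lines) = @_;
    my (@water_o, @protein_polar);
    my $index = -1;
    for my $line (@lines) {
        next unless $line =~ /^(?:ATOM|HETATM)/;
        $index++;
        my $atom_name = substr($line, 12, 4);
        my $res_name  = substr($line, 17, 3);
        $atom_name =~ s/\s+//g;
        # Crude element guess from the first letter of the atom name.
        my $first = substr($atom_name, 0, 1);
        if ($is_water{$res_name}) {
            push @water_o, $index if $first eq 'O';
        }
        elsif ($first eq 'N' || $first eq 'O') {
            push @protein_polar, $index;
        }
    }
    return (\@water_o, \@protein_polar);
}
```

Only the two returned index lists then need to go into the distance search, which already removes all carbons and most hydrogens from consideration.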
Yes, at the moment I am looking only at the distance, though I am leaving the code in a shape where the angle check can be implemented when required. Even the distance check alone was taking too long. :( The reason is that the number of atoms is about 20,000, and I am looking for something that can work up to 100K.
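For when the angle is needed, a minimal sketch of the donor-hydrogen-acceptor angle. The sub name angle_deg and the [x, y, z] array-reference layout are assumptions for illustration, and no particular angle cutoff is implied here.

```perl
use strict;
use warnings;

# Hypothetical helper: angle in degrees at the hydrogen, between the
# hydrogen->donor and hydrogen->acceptor vectors. Each argument is an
# array reference [x, y, z]. Returns undef if two points coincide.
sub angle_deg {
    my ($donor, $hydrogen, $acceptor) = @_;
    my @u = map { $donor->[$_]    - $hydrogen->[$_] } 0 .. 2;
    my @v = map { $acceptor->[$_] - $hydrogen->[$_] } 0 .. 2;

    my $dot = $u[0]*$v[0] + $u[1]*$v[1] + $u[2]*$v[2];
    my $nu  = sqrt($u[0]**2 + $u[1]**2 + $u[2]**2);
    my $nv  = sqrt($v[0]**2 + $v[1]**2 + $v[2]**2);
    return undef if $nu == 0 || $nv == 0;

    # Clamp against rounding so the square root below stays real.
    my $cos = $dot / ($nu * $nv);
    $cos =  1 if $cos >  1;
    $cos = -1 if $cos < -1;

    my $pi = 4 * atan2(1, 1);
    return atan2(sqrt(1 - $cos * $cos), $cos) * 180 / $pi;
}
```

Since this only has to run for the pairs that already passed the distance test, it should add very little to the total time.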
2. Graph Theory problem.
This unfortunately is not a Graph-Theory problem, but more of a gas-phase problem. I need to find out all the 'interacting-pairs' (ie pairs of atoms close enough) to do a more complicated analysis.
Your APPLICATION may be a gas-analysis problem, but the guts of it, finding points within a certain distance of a known point, is exactly a graph-theory problem.