How to select specific lines from a file

by Anonymous Monk
on Apr 29, 2014 at 17:11 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I am having some issues with PDB files (for those who are into bioinformatics, that is). The file has the following format:

ATOM 30 N HIS A 66 7.514 15.296 11.222 1.00 12.98 + A N ATOM 31 CA HIS A 66 7.318 14.688 12.568 1.00 12.48 + A C ATOM 32 C HIS A 66 8.676 14.309 13.156 1.00 11.62 + A C ATOM 33 O HIS A 66 9.708 14.518 12.545 1.00 11.76 + A O ATOM 34 CB HIS A 66 6.450 13.434 12.442 1.00 12.81 + A C ATOM 35 CG HIS A 66 5.000 13.829 12.378 1.00 13.36 + A C ATOM 36 ND1 HIS A 66 4.332 14.002 11.175 1.00 13.57 + A N ATOM 37 CD2 HIS A 66 4.073 14.085 13.360 1.00 13.93 + A C ATOM 38 CE1 HIS A 66 3.063 14.347 11.461 1.00 14.23 + A C ATOM 39 NE2 HIS A 66 2.851 14.410 12.778 1.00 14.47 + A N ATOM 40 N HIS A 67 8.695 13.753 14.336 1.00 10.90 + A N ATOM 41 CA HIS A 67 9.995 13.365 14.954 1.00 10.21 + A C ATOM 42 C HIS A 67 9.781 12.968 16.417 1.00 9.43 + A C ATOM 43 O HIS A 67 10.182 11.906 16.847 1.00 9.32 + A O ATOM 44 CB HIS A 67 10.961 14.553 14.889 1.00 10.37 + A C ATOM 45 CG HIS A 67 12.242 14.124 14.232 1.00 10.53 + A C ATOM 46 ND1 HIS A 67 13.309 13.614 14.955 1.00 10.58 + A N ATOM 47 CD2 HIS A 67 12.644 14.121 12.918 1.00 10.88 + A C ATOM 48 CE1 HIS A 67 14.291 13.330 14.081 1.00 10.97 + A C ATOM 49 NE2 HIS A 67 13.939 13.620 12.826 1.00 11.15 + A N ATOM 50 N PHE A 68 9.157 13.817 17.187 1.00 9.11 + A N ATOM 51 CA PHE A 68 8.928 13.491 18.622 1.00 8.55 + A C ATOM 52 C PHE A 68 7.941 12.326 18.743 1.00 7.46 + A C ATOM 53 O PHE A 68 7.041 12.176 17.939 1.00 7.39 + A O ATOM 54 CB PHE A 68 8.356 14.718 19.338 1.00 8.98 + A C ATOM 55 CG PHE A 68 9.474 15.476 20.012 1.00 9.76 + A C ATOM 56 CD1 PHE A 68 10.320 16.294 19.252 1.00 10.29 + A C ATOM 57 CD2 PHE A 68 9.668 15.358 21.393 1.00 10.19 + A C ATOM 58 CE1 PHE A 68 11.359 16.997 19.877 1.00 11.19 + A C ATOM 59 CE2 PHE A 68 10.706 16.060 22.017 1.00 11.10 + A C ATOM 60 CZ PHE A 68 11.552 16.880 21.258 1.00 11.57 + A C ATOM 61 N SER A 69 8.099 11.502 19.744 1.00 6.86 + A N ATOM 62 CA SER A 69 7.169 10.349 19.925 1.00 6.01 + A C ATOM 63 C SER A 69 7.060 9.557 18.620 1.00 5.07 + A C ATOM 64 O SER A 69 7.602 9.936 17.600 1.00 5.28 + A O ATOM 65 
CB SER A 69 5.787 10.870 20.316 1.00 6.54 + A C ATOM 66 OG SER A 69 5.072 11.237 19.142 1.00 7.08 + A O ATOM 67 N GLU A 70 6.360 8.457 18.648 1.00 4.41 + A N ATOM 68 CA GLU A 70 6.205 7.633 17.416 1.00 3.84 + A C ATOM 69 C GLU A 70 4.716 7.484 17.092 1.00 2.84 + A C ATOM 70 O GLU A 70 4.056 6.606 17.612 1.00 3.00 + A O ATOM 71 CB GLU A 70 6.813 6.249 17.649 1.00 4.55 + A C ATOM 72 CG GLU A 70 8.251 6.398 18.150 1.00 5.41 + A C ATOM 73 CD GLU A 70 9.076 5.193 17.695 1.00 6.00 + A C ATOM 74 OE1 GLU A 70 9.596 5.236 16.592 1.00 6.49 + A O ATOM 75 OE2 GLU A 70 9.175 4.246 18.458 1.00 6.28 + A O ATOM 76 N PRO A 71 4.226 8.354 16.244 1.00 2.18 + A N ATOM 77 CA PRO A 71 2.810 8.341 15.838 1.00 1.51 + A C ATOM 78 C PRO A 71 2.457 7.010 15.165 1.00 1.23 + A C ATOM 79 O PRO A 71 2.811 6.762 14.027 1.00 1.13 + A O ATOM 80 CB PRO A 71 2.675 9.502 14.842 1.00 1.87 + A C ATOM 81 CG PRO A 71 4.070 10.160 14.696 1.00 2.49 + A C ATOM 82 CD PRO A 71 5.041 9.412 15.622 1.00 2.65 + A C ATOM 83 N GLU A 72 1.757 6.154 15.858 1.00 1.29 + A N ATOM 84 CA GLU A 72 1.380 4.847 15.257 1.00 1.13 + A C ATOM 85 C GLU A 72 0.686 5.089 13.917 1.00 1.00 + A C ATOM 86 O GLU A 72 0.892 4.370 12.960 1.00 0.97 + A O ATOM 87 CB GLU A 72 0.425 4.107 16.200 1.00 1.32 + A C ATOM 88 CG GLU A 72 1.202 3.588 17.411 1.00 1.94 + A C ATOM 89 CD GLU A 72 1.030 4.556 18.583 1.00 2.47 + A C ATOM 90 OE1 GLU A 72 0.067 4.403 19.316 1.00 2.89 + A O ATOM 91 OE2 GLU A 72 1.864 5.436 18.726 1.00 3.01 + A O ATOM 92 N ILE A 73 -0.133 6.101 13.839 1.00 0.99 + A N ATOM 93 CA ILE A 73 -0.837 6.398 12.569 1.00 0.91 + A C ATOM 94 C ILE A 73 0.178 6.773 11.488 1.00 0.80 + A C ATOM 95 O ILE A 73 0.103 6.312 10.369 1.00 0.74 + A O ATOM 96 CB ILE A 73 -1.797 7.563 12.793 1.00 1.03 + A C ATOM 97 CG1 ILE A 73 -2.739 7.663 11.602 1.00 1.05 + A C ATOM 98 CG2 ILE A 73 -1.005 8.866 12.931 1.00 1.28 + A C ATOM 99 CD1 ILE A 73 -3.854 6.627 11.754 1.00 1.48 + A C ATOM 100 N THR A 74 1.127 7.609 11.812 1.00 0.81 + A N ATOM 101 CA THR A 74 
2.137 8.010 10.793 1.00 0.76 + A C ATOM 102 C THR A 74 2.836 6.763 10.246 1.00 0.68 + A C ATOM 103 O THR A 74 2.992 6.606 9.050 1.00 0.62 + A O ATOM 104 CB THR A 74 3.174 8.941 11.427 1.00 0.86 + A C ATOM 105 OG1 THR A 74 3.745 8.312 12.563 1.00 1.35 + A O ATOM 106 CG2 THR A 74 2.495 10.247 11.847 1.00 1.19 + A C ATOM 107 N LEU A 75 3.258 5.873 11.102 1.00 0.72 + A N ATOM 108 CA LEU A 75 3.946 4.644 10.604 1.00 0.69 + A C ATOM 109 C LEU A 75 2.948 3.778 9.834 1.00 0.58 + A C ATOM 110 O LEU A 75 3.230 3.323 8.746 1.00 0.50 + A O ATOM 111 CB LEU A 75 4.511 3.843 11.781 1.00 0.82 + A C ATOM 112 CG LEU A 75 5.693 2.991 11.307 1.00 0.88 + A C ATOM 113 CD1 LEU A 75 6.011 1.928 12.360 1.00 0.87 + A C ATOM 114 CD2 LEU A 75 5.338 2.302 9.986 1.00 1.02 + A C ATOM 115 N ILE A 76 1.782 3.545 10.378 1.00 0.62 + A N ATOM 116 CA ILE A 76 0.798 2.715 9.636 1.00 0.57 + A C ATOM 117 C ILE A 76 0.546 3.368 8.277 1.00 0.45 + A C ATOM 118 O ILE A 76 0.597 2.726 7.247 1.00 0.38 + A O ATOM 119 CB ILE A 76 -0.512 2.633 10.426 1.00 0.69 + A C ATOM 120 CG1 ILE A 76 -0.268 1.887 11.738 1.00 0.91 + A C ATOM 121 CG2 ILE A 76 -1.558 1.881 9.599 1.00 0.76 + A C ATOM 122 CD1 ILE A 76 -1.449 2.116 12.685 1.00 1.10 + A C ATOM 123 N ILE A 77 0.311 4.652 8.270 1.00 0.44 + A N ATOM 124 CA ILE A 77 0.097 5.363 6.982 1.00 0.36 + A C ATOM 125 C ILE A 77 1.322 5.151 6.095 1.00 0.28 + A C ATOM 126 O ILE A 77 1.215 4.831 4.927 1.00 0.25 + A O ATOM 127 CB ILE A 77 -0.094 6.859 7.243 1.00 0.39 + A C ATOM 128 CG1 ILE A 77 -1.441 7.088 7.931 1.00 1.21 + A C ATOM 129 CG2 ILE A 77 -0.063 7.615 5.915 1.00 1.18 + A C ATOM 130 CD1 ILE A 77 -1.571 8.564 8.316 1.00 1.58 + A C ATOM 131 N PHE A 78 2.490 5.338 6.650 1.00 0.31 + A N ATOM 132 CA PHE A 78 3.737 5.156 5.858 1.00 0.32 + A C ATOM 133 C PHE A 78 3.768 3.744 5.274 1.00 0.30 + A C ATOM 134 O PHE A 78 4.042 3.555 4.109 1.00 0.30 + A O ATOM 135 CB PHE A 78 4.961 5.360 6.756 1.00 0.41 + A C ATOM 136 CG PHE A 78 5.322 6.825 6.795 1.00 0.80 + A C ATOM 137 CD1 PHE A 78 
4.315 7.795 6.880 1.00 1.52 + A C ATOM 138 CD2 PHE A 78 6.664 7.215 6.742 1.00 1.66 + A C ATOM 139 CE1 PHE A 78 4.651 9.153 6.913 1.00 1.97 + A C ATOM 140 CE2 PHE A 78 7.003 8.573 6.775 1.00 2.09 + A C ATOM 141 CZ PHE A 78 5.996 9.542 6.861 1.00 1.95 + A C ATOM 142 N GLY A 79 3.488 2.749 6.072 1.00 0.31 + A N ATOM 143 CA GLY A 79 3.506 1.359 5.538 1.00 0.32 + A C ATOM 144 C GLY A 79 2.538 1.265 4.358 1.00 0.26 + A C ATOM 145 O GLY A 79 2.873 0.754 3.307 1.00 0.26 + A O ATOM 146 N VAL A 80 1.343 1.772 4.510 1.00 0.23 + A N ATOM 147 CA VAL A 80 0.372 1.725 3.383 1.00 0.22 + A C ATOM 148 C VAL A 80 0.988 2.429 2.174 1.00 0.21 + A C ATOM 149 O VAL A 80 0.984 1.921 1.070 1.00 0.22 + A O ATOM 150 CB VAL A 80 -0.920 2.435 3.795 1.00 0.24 + A C ATOM 151 CG1 VAL A 80 -1.923 2.384 2.642 1.00 0.28 + A C ATOM 152 CG2 VAL A 80 -1.513 1.725 5.015 1.00 0.28 + A C ATOM 153 N MET A 81 1.537 3.594 2.387 1.00 0.22 + A N ATOM 154 CA MET A 81 2.178 4.339 1.269 1.00 0.25 + A C ATOM 155 C MET A 81 3.276 3.473 0.648 1.00 0.27 + A C ATOM 156 O MET A 81 3.375 3.345 -0.557 1.00 0.28 + A O ATOM 157 CB MET A 81 2.785 5.636 1.805 1.00 0.29 + A C ATOM 158 CG MET A 81 1.665 6.623 2.136 1.00 0.36 + A C ATOM 159 SD MET A 81 2.273 8.316 1.947 1.00 1.07 + A S ATOM 160 CE MET A 81 0.687 9.081 1.529 1.00 1.70 + A C ATOM 161 N ALA A 82 4.108 2.887 1.464 1.00 0.28 + A N ATOM 162 CA ALA A 82 5.207 2.039 0.933 1.00 0.32 + A C ATOM 163 C ALA A 82 4.624 0.923 0.059 1.00 0.30 + A C ATOM 164 O ALA A 82 5.097 0.679 -1.033 1.00 0.30 + A O ATOM 165 CB ALA A 82 6.022 1.442 2.092 1.00 0.37 + A C ATOM 166 N GLY A 83 3.597 0.244 0.508 1.00 0.28 + A N ATOM 167 CA GLY A 83 3.013 -0.836 -0.336 1.00 0.27 + A C ATOM 168 C GLY A 83 2.602 -0.239 -1.680 1.00 0.23 + A C ATOM 169 O GLY A 83 2.911 -0.768 -2.730 1.00 0.23 + A O ATOM 170 N VAL A 84 1.916 0.872 -1.654 1.00 0.21 + A N ATOM 171 CA VAL A 84 1.501 1.512 -2.933 1.00 0.19 + A C ATOM 172 C VAL A 84 2.739 1.795 -3.786 1.00 0.20 + A C ATOM 173 O VAL A 84 2.772 1.502 -4.966 
1.00 0.19 + A O ATOM 174 CB VAL A 84 0.774 2.826 -2.636 1.00 0.22 + A C ATOM 175 CG1 VAL A 84 0.564 3.598 -3.941 1.00 0.28 + A C ATOM 176 CG2 VAL A 84 -0.587 2.526 -2.000 1.00 0.27 + A C ATOM 177 N ILE A 85 3.760 2.367 -3.200 1.00 0.23 + A N ATOM 178 CA ILE A 85 4.996 2.672 -3.975 1.00 0.27 + A C ATOM 179 C ILE A 85 5.547 1.392 -4.608 1.00 0.27 + A C ATOM 180 O ILE A 85 5.886 1.365 -5.775 1.00 0.28 + A O ATOM 181 CB ILE A 85 6.050 3.266 -3.039 1.00 0.33 + A C ATOM 182 CG1 ILE A 85 5.504 4.550 -2.407 1.00 0.67 + A C ATOM 183 CG2 ILE A 85 7.316 3.588 -3.834 1.00 0.75 + A C ATOM 184 CD1 ILE A 85 5.459 5.661 -3.459 1.00 1.20 + A C ATOM 185 N GLY A 86 5.647 0.329 -3.855 1.00 0.29 + A N ATOM 186 CA GLY A 86 6.184 -0.934 -4.435 1.00 0.33 + A C ATOM 187 C GLY A 86 5.336 -1.337 -5.640 1.00 0.30 + A C ATOM 188 O GLY A 86 5.849 -1.653 -6.693 1.00 0.33 + A O ATOM 189 N THR A 87 4.038 -1.315 -5.496 1.00 0.25 + A N ATOM 190 CA THR A 87 3.164 -1.685 -6.644 1.00 0.25 + A C ATOM 191 C THR A 87 3.479 -0.771 -7.827 1.00 0.26 + A C ATOM 192 O THR A 87 3.649 -1.218 -8.945 1.00 0.32 + A O ATOM 193 CB THR A 87 1.695 -1.518 -6.246 1.00 0.24 + A C ATOM 194 OG1 THR A 87 1.410 -2.352 -5.132 1.00 0.34 + A O ATOM 195 CG2 THR A 87 0.803 -1.908 -7.425 1.00 0.31 + A C ATOM 196 N ILE A 88 3.561 0.510 -7.586 1.00 0.24 + A N ATOM 197 CA ILE A 88 3.868 1.463 -8.687 1.00 0.33 + A C ATOM 198 C ILE A 88 5.195 1.075 -9.340 1.00 0.39 + A C ATOM 199 O ILE A 88 5.308 1.014 -10.549 1.00 0.47 + A O ATOM 200 CB ILE A 88 3.970 2.879 -8.119 1.00 0.37 + A C ATOM 201 CG1 ILE A 88 2.594 3.324 -7.619 1.00 0.39 + A C ATOM 202 CG2 ILE A 88 4.450 3.836 -9.211 1.00 0.47 + A C ATOM 203 CD1 ILE A 88 2.658 4.786 -7.171 1.00 0.49 + A C ATOM 204 N LEU A 89 6.201 0.811 -8.552 1.00 0.39 + A N ATOM 205 CA LEU A 89 7.517 0.428 -9.127 1.00 0.49 + A C ATOM 206 C LEU A 89 7.322 -0.806 -10.021 1.00 0.53 + A C ATOM 207 O LEU A 89 7.795 -0.855 -11.139 1.00 0.62 + A O ATOM 208 CB LEU A 89 8.509 0.152 -7.973 1.00 0.53 + A C ATOM 209 CG 
LEU A 89 9.175 -1.221 -8.117 1.00 0.59 + A C ATOM 210 CD1 LEU A 89 10.168 -1.191 -9.284 1.00 0.68 + A C ATOM 211 CD2 LEU A 89 9.925 -1.557 -6.825 1.00 0.65 + A C ATOM 212 N LEU A 90 6.621 -1.798 -9.542 1.00 0.49 + A N ATOM 213 CA LEU A 90 6.397 -3.013 -10.380 1.00 0.57 + A C ATOM 214 C LEU A 90 5.728 -2.598 -11.689 1.00 0.60 + A C ATOM 215 O LEU A 90 6.134 -2.995 -12.762 1.00 0.71 + A O ATOM 216 CB LEU A 90 5.489 -3.997 -9.632 1.00 0.56 + A C ATOM 217 CG LEU A 90 5.469 -5.334 -10.375 1.00 1.24 + A C ATOM 218 CD1 LEU A 90 6.764 -6.096 -10.086 1.00 1.76 + A C ATOM 219 CD2 LEU A 90 4.272 -6.164 -9.904 1.00 1.92 + A C ATOM 220 N ILE A 91 4.708 -1.788 -11.608 1.00 0.55 + A N ATOM 221 CA ILE A 91 4.013 -1.329 -12.843 1.00 0.65 + A C ATOM 222 C ILE A 91 5.014 -0.643 -13.770 1.00 0.75 + A C ATOM 223 O ILE A 91 5.058 -0.907 -14.955 1.00 0.87 + A O ATOM 224 CB ILE A 91 2.900 -0.347 -12.475 1.00 0.63 + A C ATOM 225 CG1 ILE A 91 1.883 -1.050 -11.574 1.00 0.58 + A C ATOM 226 CG2 ILE A 91 2.202 0.135 -13.748 1.00 0.75 + A C ATOM 227 CD1 ILE A 91 0.969 -0.007 -10.928 1.00 0.66 + A C ATOM 228 N SER A 92 5.821 0.240 -13.247 1.00 0.73 + A N ATOM 229 CA SER A 92 6.804 0.931 -14.123 1.00 0.87 + A C ATOM 230 C SER A 92 7.674 -0.119 -14.812 1.00 0.94 + A C ATOM 231 O SER A 92 7.877 -0.079 -16.008 1.00 1.07 + A O ATOM 232 CB SER A 92 7.686 1.851 -13.279 1.00 0.86 + A C ATOM 233 OG SER A 92 6.867 2.795 -12.602 1.00 1.29 + A O ATOM 234 N TYR A 93 8.182 -1.070 -14.075 1.00 0.89 + A N ATOM 235 CA TYR A 93 9.018 -2.119 -14.719 1.00 1.00 + A C ATOM 236 C TYR A 93 8.202 -2.810 -15.817 1.00 1.05 + A C ATOM 237 O TYR A 93 8.661 -2.983 -16.930 1.00 1.18 + A O ATOM 238 CB TYR A 93 9.448 -3.150 -13.672 1.00 0.96 + A C ATOM 239 CG TYR A 93 10.382 -4.150 -14.306 1.00 1.27 + A C ATOM 240 CD1 TYR A 93 10.075 -4.697 -15.558 1.00 1.62 + A C ATOM 241 CD2 TYR A 93 11.557 -4.529 -13.646 1.00 2.13 + A C ATOM 242 CE1 TYR A 93 10.942 -5.626 -16.149 1.00 1.95 + A C ATOM 243 CE2 TYR A 93 12.424 -5.455 -14.237 
1.00 2.53 + A C ATOM 244 CZ TYR A 93 12.117 -6.003 -15.487 1.00 2.15 + A C ATOM 245 OH TYR A 93 12.973 -6.917 -16.071 1.00 2.64 + A O ATOM 246 N GLY A 94 6.988 -3.200 -15.516 1.00 0.97 + A N ATOM 247 CA GLY A 94 6.140 -3.866 -16.542 1.00 1.05 + A C ATOM 248 C GLY A 94 5.987 -2.949 -17.753 1.00 1.16 + A C ATOM 249 O GLY A 94 6.148 -3.362 -18.886 1.00 1.29 + A O ATOM 250 N ILE A 95 5.676 -1.701 -17.524 1.00 1.14 + A N ATOM 251 CA ILE A 95 5.510 -0.752 -18.665 1.00 1.29 + A C ATOM 252 C ILE A 95 6.801 -0.728 -19.492 1.00 1.41 + A C ATOM 253 O ILE A 95 6.775 -0.812 -20.703 1.00 1.55 + A O ATOM 254 CB ILE A 95 5.151 0.656 -18.138 1.00 1.28 + A C ATOM 255 CG1 ILE A 95 3.779 0.599 -17.461 1.00 1.74 + A C ATOM 256 CG2 ILE A 95 5.104 1.681 -19.280 1.00 1.77 + A C ATOM 257 CD1 ILE A 95 3.444 1.971 -16.868 1.00 2.19 + A C ATOM 258 N ARG A 96 7.931 -0.618 -18.843 1.00 1.37 + A N ATOM 259 CA ARG A 96 9.221 -0.591 -19.590 1.00 1.51 + A C ATOM 260 C ARG A 96 9.332 -1.851 -20.445 1.00 1.58 + A C ATOM 261 O ARG A 96 9.672 -1.801 -21.608 1.00 1.73 + A O ATOM 262 CB ARG A 96 10.388 -0.536 -18.598 1.00 1.46 + A C ATOM 263 CG ARG A 96 11.710 -0.519 -19.367 1.00 1.73 + A C ATOM 264 CD ARG A 96 11.774 0.732 -20.250 1.00 2.03 + A C ATOM 265 NE ARG A 96 13.061 0.742 -21.007 1.00 2.00 + A N ATOM 266 CZ ARG A 96 13.335 1.731 -21.813 1.00 2.45 + A C ATOM 267 NH1 ARG A 96 12.469 2.694 -21.983 1.00 3.14 + A N ATOM 268 NH2 ARG A 96 14.475 1.757 -22.448 1.00 2.76 + A N ATOM 269 N ARG A 97 9.046 -2.988 -19.886 1.00 1.50 + A N ATOM 270 CA ARG A 97 9.132 -4.234 -20.687 1.00 1.59 + A C ATOM 271 C ARG A 97 8.225 -4.110 -21.911 1.00 1.69 + A C ATOM 272 O ARG A 97 8.616 -4.419 -23.020 1.00 1.84 + A O ATOM 273 CB ARG A 97 8.698 -5.431 -19.837 1.00 1.50 + A C ATOM 274 CG ARG A 97 8.729 -6.702 -20.687 1.00 1.77 + A C ATOM 275 CD ARG A 97 10.167 -6.987 -21.129 1.00 2.23 + A C ATOM 276 NE ARG A 97 10.193 -8.174 -22.031 1.00 2.70 + A N ATOM 277 CZ ARG A 97 11.320 -8.587 -22.543 1.00 3.07 + A C ATOM 278 NH1 
ARG A 97 12.427 -7.943 -22.289 1.00 3.56 + A N ATOM 279 NH2 ARG A 97 11.340 -9.641 -23.308 1.00 3.58 + A N ATOM 280 N LEU A 98 7.015 -3.660 -21.722 1.00 1.64 + A N ATOM 281 CA LEU A 98 6.083 -3.519 -22.878 1.00 1.75 + A C ATOM 282 C LEU A 98 6.715 -2.612 -23.938 1.00 1.91 + A C ATOM 283 O LEU A 98 6.719 -2.926 -25.113 1.00 2.05 + A O ATOM 284 CB LEU A 98 4.766 -2.900 -22.400 1.00 1.69 + A C ATOM 285 CG LEU A 98 4.063 -3.870 -21.447 1.00 2.11 + A C ATOM 286 CD1 LEU A 98 3.102 -3.090 -20.546 1.00 2.66 + A C ATOM 287 CD2 LEU A 98 3.279 -4.903 -22.258 1.00 2.38 + A C ATOM 288 N ILE A 99 7.251 -1.492 -23.536 1.00 1.90 + A N ATOM 289 CA ILE A 99 7.882 -0.574 -24.525 1.00 2.06 + A C ATOM 290 C ILE A 99 9.313 -1.053 -24.814 1.00 2.14 + A C ATOM 291 O ILE A 99 9.510 -2.109 -25.383 1.00 2.20 + A O ATOM 292 CB ILE A 99 7.898 0.849 -23.959 1.00 2.05 + A C ATOM 293 CG1 ILE A 99 6.467 1.290 -23.641 1.00 1.98 + A C ATOM 294 CG2 ILE A 99 8.504 1.805 -24.990 1.00 2.03 + A C ATOM 295 CD1 ILE A 99 6.471 2.199 -22.415 1.00 2.00 + A C ATOM 296 N LYS A 100 10.311 -0.303 -24.427 1.00 2.35 + A N ATOM 297 CA LYS A 100 11.711 -0.743 -24.687 1.00 2.45 + A C ATOM 298 C LYS A 100 11.846 -1.196 -26.141 1.00 2.85 + A C ATOM 299 O LYS A 100 10.926 -1.082 -26.925 1.00 3.21 + A O ATOM 300 CB LYS A 100 12.059 -1.904 -23.752 1.00 2.30 + A C ATOM 301 CG LYS A 100 13.539 -2.258 -23.905 1.00 2.60 + A C ATOM 302 CD LYS A 100 13.875 -3.454 -23.008 1.00 2.94 + A C ATOM 303 CE LYS A 100 15.321 -3.880 -23.246 1.00 3.56 + A C ATOM 304 NZ LYS A 100 15.659 -5.016 -22.342 1.00 3.91 + A N ATOM 305 N LYS A 101 12.989 -1.712 -26.508 1.00 3.09 + A N ATOM 306 CA LYS A 101 13.180 -2.173 -27.913 1.00 3.63 + A C ATOM 307 C LYS A 101 13.147 -0.966 -28.856 1.00 4.10 + A C ATOM 308 O LYS A 101 13.338 -1.165 -30.044 1.00 4.56 + A O ATOM 309 CB LYS A 101 12.060 -3.142 -28.288 1.00 3.99 + A C ATOM 310 CG LYS A 101 12.512 -4.017 -29.459 1.00 4.58 + A C ATOM 311 CD LYS A 101 11.355 -4.192 -30.443 1.00 5.25 + A C ATOM 312 CE LYS 
A 101 11.905 -4.596 -31.812 1.00 6.09 + A C ATOM 313 NZ LYS A 101 12.682 -5.862 -31.679 1.00 6.53 + A N ATOM 314 OXT LYS A 101 12.929 0.132 -28.372 1.00 4.44 + A O ATOM 345 N HIS B 66 -10.725 2.701 13.152 1.00 13.65 + B N ATOM 346 CA HIS B 66 -10.036 1.822 14.141 1.00 12.91 + B C ATOM 347 C HIS B 66 -10.950 0.653 14.519 1.00 12.30 + B C ATOM 348 O HIS B 66 -11.988 0.833 15.123 1.00 12.38 + B O ATOM 349 CB HIS B 66 -9.699 2.628 15.397 1.00 12.96 + B C ATOM 350 CG HIS B 66 -8.323 2.255 15.881 1.00 13.34 + B C ATOM 351 ND1 HIS B 66 -7.174 2.829 15.357 1.00 13.54 + B N ATOM 352 CD2 HIS B 66 -7.895 1.370 16.840 1.00 13.74 + B C ATOM 353 CE1 HIS B 66 -6.123 2.289 15.998 1.00 14.03 + B C ATOM 354 NE2 HIS B 66 -6.506 1.393 16.912 1.00 14.18 + B N ATOM 355 N HIS B 67 -10.570 -0.544 14.166 1.00 11.87 + B N ATOM 356 CA HIS B 67 -11.414 -1.726 14.502 1.00 11.46 + B C ATOM 357 C HIS B 67 -10.910 -2.949 13.737 1.00 10.55 + B C ATOM 358 O HIS B 67 -11.682 -3.768 13.278 1.00 10.53 + B O ATOM 359 CB HIS B 67 -12.866 -1.445 14.116 1.00 11.84 + B C ATOM 360 CG HIS B 67 -13.682 -1.212 15.358 1.00 12.35 + B C ATOM 361 ND1 HIS B 67 -14.413 -0.052 15.554 1.00 12.68 + B N ATOM 362 CD2 HIS B 67 -13.889 -1.980 16.476 1.00 12.80 + B C ATOM 363 CE1 HIS B 67 -15.022 -0.153 16.749 1.00 13.29 + B C ATOM 364 NE2 HIS B 67 -14.736 -1.310 17.354 1.00 13.39 + B N ATOM 365 N PHE B 68 -9.619 -3.082 13.594 1.00 10.00 + B N ATOM 366 CA PHE B 68 -9.068 -4.253 12.856 1.00 9.29 + B C ATOM 367 C PHE B 68 -7.788 -4.733 13.542 1.00 8.27 + B C ATOM 368 O PHE B 68 -7.818 -5.593 14.400 1.00 8.29 + B O ATOM 369 CB PHE B 68 -8.753 -3.843 11.415 1.00 9.52 + B C ATOM 370 CG PHE B 68 -9.856 -4.323 10.503 1.00 10.43 + B C ATOM 371 CD1 PHE B 68 -11.115 -3.716 10.547 1.00 10.86 + B C ATOM 372 CD2 PHE B 68 -9.616 -5.377 9.613 1.00 11.07 + B C ATOM 373 CE1 PHE B 68 -12.137 -4.162 9.699 1.00 11.86 + B C ATOM 374 CE2 PHE B 68 -10.639 -5.822 8.764 1.00 12.06 + B C ATOM 375 CZ PHE B 68 -11.899 -5.215 8.808 1.00 12.43 + B C 
ATOM 376 N SER B 69 -6.664 -4.184 13.172 1.00 7.61 + B N ATOM 377 CA SER B 69 -5.383 -4.610 13.805 1.00 6.80 + B C ATOM 378 C SER B 69 -5.132 -3.775 15.062 1.00 5.60 + B C ATOM 379 O SER B 69 -4.777 -2.614 14.987 1.00 5.39 + B O ATOM 380 CB SER B 69 -4.235 -4.409 12.818 1.00 7.36 + B C ATOM 381 OG SER B 69 -4.382 -5.316 11.733 1.00 8.12 + B O ATOM 382 N GLU B 70 -5.314 -4.354 16.216 1.00 5.12 + B N ATOM 383 CA GLU B 70 -5.088 -3.595 17.478 1.00 4.22 + B C ATOM 384 C GLU B 70 -3.595 -3.285 17.641 1.00 3.04 + B C ATOM 385 O GLU B 70 -3.220 -2.141 17.803 1.00 3.18 + B O ATOM 386 CB GLU B 70 -5.572 -4.425 18.669 1.00 4.88 + B C ATOM 387 CG GLU B 70 -6.895 -3.853 19.183 1.00 5.77 + B C ATOM 388 CD GLU B 70 -8.063 -4.588 18.523 1.00 6.71 + B C ATOM 389 OE1 GLU B 70 -8.099 -5.803 18.613 1.00 7.29 + B O ATOM 390 OE2 GLU B 70 -8.903 -3.920 17.940 1.00 7.09 + B O ATOM 391 N PRO B 71 -2.782 -4.314 17.600 1.00 2.28 + B N ATOM 392 CA PRO B 71 -1.322 -4.162 17.747 1.00 1.48 + B C ATOM 393 C PRO B 71 -0.761 -3.262 16.642 1.00 1.33 + B C ATOM 394 O PRO B 71 -1.346 -3.116 15.587 1.00 1.29 + B O ATOM 395 CB PRO B 71 -0.763 -5.587 17.622 1.00 2.00 + B C ATOM 396 CG PRO B 71 -1.963 -6.549 17.434 1.00 2.71 + B C ATOM 397 CD PRO B 71 -3.245 -5.700 17.403 1.00 2.87 + B C ATOM 398 N GLU B 72 0.373 -2.658 16.879 1.00 1.30 + B N ATOM 399 CA GLU B 72 0.975 -1.766 15.848 1.00 1.19 + B C ATOM 400 C GLU B 72 1.393 -2.593 14.630 1.00 1.06 + B C ATOM 401 O GLU B 72 0.765 -2.543 13.592 1.00 1.01 + B O ATOM 402 CB GLU B 72 2.206 -1.071 16.431 1.00 1.29 + B C ATOM 403 CG GLU B 72 1.789 -0.202 17.619 1.00 1.90 + B C ATOM 404 CD GLU B 72 2.684 -0.519 18.819 1.00 2.25 + B C ATOM 405 OE1 GLU B 72 2.687 -1.662 19.245 1.00 2.89 + B O ATOM 406 OE2 GLU B 72 3.349 0.387 19.290 1.00 2.52 + B O ATOM 407 N ILE B 73 2.456 -3.346 14.748 1.00 1.07 + B N ATOM 408 CA ILE B 73 2.930 -4.169 13.602 1.00 0.98 + B C ATOM 409 C ILE B 73 1.736 -4.780 12.861 1.00 0.89 + B C ATOM 410 O ILE B 73 1.702 -4.819 11.647 1.00 0.82 
+ B O ATOM 411 CB ILE B 73 3.841 -5.282 14.129 1.00 1.09 + B C ATOM 412 CG1 ILE B 73 4.696 -5.806 12.981 1.00 1.29 + B C ATOM 413 CG2 ILE B 73 3.002 -6.427 14.707 1.00 1.27 + B C ATOM 414 CD1 ILE B 73 6.170 -5.541 13.287 1.00 2.11 + B C ATOM 415 N THR B 74 0.758 -5.256 13.580 1.00 0.92 + B N ATOM 416 CA THR B 74 -0.431 -5.858 12.915 1.00 0.87 + B C ATOM 417 C THR B 74 -1.122 -4.800 12.051 1.00 0.77 + B C ATOM 418 O THR B 74 -1.476 -5.048 10.913 1.00 0.69 + B O ATOM 419 CB THR B 74 -1.407 -6.386 13.973 1.00 0.99 + B C ATOM 420 OG1 THR B 74 -1.442 -5.497 15.083 1.00 1.57 + B O ATOM 421 CG2 THR B 74 -0.954 -7.769 14.440 1.00 1.43 + B C ATOM 422 N LEU B 75 -1.329 -3.624 12.577 1.00 0.81 + B N ATOM 423 CA LEU B 75 -2.004 -2.571 11.765 1.00 0.74 + B C ATOM 424 C LEU B 75 -1.042 -2.027 10.710 1.00 0.62 + B C ATOM 425 O LEU B 75 -1.398 -1.895 9.558 1.00 0.52 + B O ATOM 426 CB LEU B 75 -2.460 -1.423 12.674 1.00 0.87 + B C ATOM 427 CG LEU B 75 -3.786 -0.854 12.164 1.00 1.11 + B C ATOM 428 CD1 LEU B 75 -4.041 0.507 12.817 1.00 1.02 + B C ATOM 429 CD2 LEU B 75 -3.730 -0.680 10.643 1.00 1.63 + B C ATOM 430 N ILE B 76 0.168 -1.698 11.078 1.00 0.66 + B N ATOM 431 CA ILE B 76 1.105 -1.157 10.056 1.00 0.59 + B C ATOM 432 C ILE B 76 1.261 -2.172 8.919 1.00 0.48 + B C ATOM 433 O ILE B 76 1.135 -1.833 7.757 1.00 0.39 + B O ATOM 434 CB ILE B 76 2.472 -0.899 10.696 1.00 0.70 + B C ATOM 435 CG1 ILE B 76 2.311 0.070 11.868 1.00 0.89 + B C ATOM 436 CG2 ILE B 76 3.414 -0.292 9.655 1.00 0.73 + B C ATOM 437 CD1 ILE B 76 3.576 0.042 12.728 1.00 1.02 + B C ATOM 438 N ILE B 77 1.501 -3.417 9.237 1.00 0.51 + B N ATOM 439 CA ILE B 77 1.623 -4.439 8.159 1.00 0.45 + B C ATOM 440 C ILE B 77 0.328 -4.451 7.349 1.00 0.35 + B C ATOM 441 O ILE B 77 0.336 -4.430 6.133 1.00 0.30 + B O ATOM 442 CB ILE B 77 1.892 -5.820 8.767 1.00 0.52 + B C ATOM 443 CG1 ILE B 77 2.302 -6.791 7.659 1.00 1.19 + B C ATOM 444 CG2 ILE B 77 0.637 -6.345 9.462 1.00 1.41 + B C ATOM 445 CD1 ILE B 77 3.696 -7.344 7.957 1.00 1.86 + B 
C ATOM 446 N PHE B 78 -0.789 -4.477 8.027 1.00 0.37 + B N ATOM 447 CA PHE B 78 -2.103 -4.487 7.318 1.00 0.34 + B C ATOM 448 C PHE B 78 -2.201 -3.263 6.413 1.00 0.29 + B C ATOM 449 O PHE B 78 -2.590 -3.358 5.265 1.00 0.27 + B O ATOM 450 CB PHE B 78 -3.253 -4.460 8.333 1.00 0.44 + B C ATOM 451 CG PHE B 78 -3.822 -5.850 8.479 1.00 1.05 + B C ATOM 452 CD1 PHE B 78 -3.093 -6.837 9.156 1.00 1.80 + B C ATOM 453 CD2 PHE B 78 -5.076 -6.155 7.938 1.00 1.97 + B C ATOM 454 CE1 PHE B 78 -3.621 -8.128 9.291 1.00 2.57 + B C ATOM 455 CE2 PHE B 78 -5.604 -7.446 8.075 1.00 2.71 + B C ATOM 456 CZ PHE B 78 -4.875 -8.432 8.751 1.00 2.81 + B C ATOM 457 N GLY B 79 -1.859 -2.109 6.915 1.00 0.30 + B N ATOM 458 CA GLY B 79 -1.943 -0.879 6.079 1.00 0.28 + B C ATOM 459 C GLY B 79 -1.104 -1.070 4.817 1.00 0.22 + B C ATOM 460 O GLY B 79 -1.550 -0.792 3.722 1.00 0.23 + B O ATOM 461 N VAL B 80 0.104 -1.550 4.950 1.00 0.22 + B N ATOM 462 CA VAL B 80 0.942 -1.755 3.736 1.00 0.23 + B C ATOM 463 C VAL B 80 0.195 -2.684 2.776 1.00 0.22 + B C ATOM 464 O VAL B 80 0.059 -2.403 1.600 1.00 0.23 + B O ATOM 465 CB VAL B 80 2.271 -2.387 4.145 1.00 0.27 + B C ATOM 466 CG1 VAL B 80 3.116 -2.654 2.896 1.00 0.33 + B C ATOM 467 CG2 VAL B 80 3.021 -1.426 5.071 1.00 0.30 + B C ATOM 468 N MET B 81 -0.323 -3.775 3.276 1.00 0.22 + B N ATOM 469 CA MET B 81 -1.093 -4.701 2.401 1.00 0.23 + B C ATOM 470 C MET B 81 -2.254 -3.927 1.772 1.00 0.24 + B C ATOM 471 O MET B 81 -2.497 -3.993 0.583 1.00 0.25 + B O ATOM 472 CB MET B 81 -1.641 -5.858 3.238 1.00 0.26 + B C ATOM 473 CG MET B 81 -0.477 -6.620 3.877 1.00 0.32 + B C ATOM 474 SD MET B 81 -1.054 -8.242 4.435 1.00 1.15 + B S ATOM 475 CE MET B 81 -2.279 -7.655 5.630 1.00 1.83 + B C ATOM 476 N ALA B 82 -2.967 -3.188 2.581 1.00 0.25 + B N ATOM 477 CA ALA B 82 -4.115 -2.392 2.068 1.00 0.27 + B C ATOM 478 C ALA B 82 -3.638 -1.442 0.963 1.00 0.26 + B C ATOM 479 O ALA B 82 -4.262 -1.330 -0.073 1.00 0.27 + B O ATOM 480 CB ALA B 82 -4.760 -1.600 3.215 1.00 0.31 + B C ATOM 481 N GLY B 
83 -2.546 -0.745 1.170 1.00 0.25 + B N ATOM 482 CA GLY B 83 -2.072 0.195 0.118 1.00 0.26 + B C ATOM 483 C GLY B 83 -1.867 -0.572 -1.186 1.00 0.24 + B C ATOM 484 O GLY B 83 -2.310 -0.150 -2.235 1.00 0.24 + B O ATOM 485 N VAL B 84 -1.212 -1.699 -1.139 1.00 0.22 + B N ATOM 486 CA VAL B 84 -1.009 -2.475 -2.397 1.00 0.21 + B C ATOM 487 C VAL B 84 -2.378 -2.782 -3.017 1.00 0.21 + B C ATOM 488 O VAL B 84 -2.595 -2.581 -4.197 1.00 0.21 + B O ATOM 489 CB VAL B 84 -0.285 -3.784 -2.087 1.00 0.23 + B C ATOM 490 CG1 VAL B 84 -0.067 -4.566 -3.383 1.00 0.28 + B C ATOM 491 CG2 VAL B 84 1.070 -3.476 -1.444 1.00 0.31 + B C ATOM 492 N ILE B 85 -3.305 -3.257 -2.227 1.00 0.23 + B N ATOM 493 CA ILE B 85 -4.662 -3.568 -2.763 1.00 0.25 + B C ATOM 494 C ILE B 85 -5.271 -2.314 -3.399 1.00 0.26 + B C ATOM 495 O ILE B 85 -5.802 -2.358 -4.489 1.00 0.26 + B O ATOM 496 CB ILE B 85 -5.560 -4.052 -1.623 1.00 0.30 + B C ATOM 497 CG1 ILE B 85 -5.065 -5.411 -1.127 1.00 0.32 + B C ATOM 498 CG2 ILE B 85 -6.997 -4.185 -2.129 1.00 0.40 + B C ATOM 499 CD1 ILE B 85 -5.532 -5.633 0.312 1.00 1.06 + B C ATOM 500 N GLY B 86 -5.206 -1.200 -2.719 1.00 0.27 + B N ATOM 501 CA GLY B 86 -5.791 0.051 -3.281 1.00 0.30 + B C ATOM 502 C GLY B 86 -5.162 0.349 -4.642 1.00 0.27 + B C ATOM 503 O GLY B 86 -5.849 0.643 -5.600 1.00 0.29 + B O ATOM 504 N THR B 87 -3.862 0.275 -4.741 1.00 0.24 + B N ATOM 505 CA THR B 87 -3.207 0.554 -6.047 1.00 0.24 + B C ATOM 506 C THR B 87 -3.780 -0.390 -7.108 1.00 0.24 + B C ATOM 507 O THR B 87 -4.145 0.027 -8.191 1.00 0.28 + B O ATOM 508 CB THR B 87 -1.697 0.331 -5.925 1.00 0.25 + B C ATOM 509 OG1 THR B 87 -1.204 1.051 -4.804 1.00 0.32 + B O ATOM 510 CG2 THR B 87 -1.006 0.823 -7.198 1.00 0.28 + B C ATOM 511 N ILE B 88 -3.871 -1.656 -6.801 1.00 0.23 + B N ATOM 512 CA ILE B 88 -4.433 -2.617 -7.791 1.00 0.29 + B C ATOM 513 C ILE B 88 -5.841 -2.166 -8.188 1.00 0.33 + B C ATOM 514 O ILE B 88 -6.184 -2.117 -9.352 1.00 0.40 + B O ATOM 515 CB ILE B 88 -4.499 -4.013 -7.167 1.00 0.33 + B C ATOM 516 
CG1 ILE B 88 -3.082 -4.578 -7.038 1.00 0.34 + B C ATOM 517 CG2 ILE B 88 -5.333 -4.932 -8.059 1.00 0.40 + B C ATOM 518 CD1 ILE B 88 -3.100 -5.798 -6.114 1.00 0.41 + B C ATOM 519 N LEU B 89 -6.654 -1.829 -7.225 1.00 0.33 + B N ATOM 520 CA LEU B 89 -8.037 -1.372 -7.533 1.00 0.41 + B C ATOM 521 C LEU B 89 -7.963 -0.162 -8.476 1.00 0.44 + B C ATOM 522 O LEU B 89 -8.667 -0.089 -9.464 1.00 0.51 + B O ATOM 523 CB LEU B 89 -8.757 -1.028 -6.210 1.00 0.44 + B C ATOM 524 CG LEU B 89 -9.339 0.392 -6.232 1.00 0.76 + B C ATOM 525 CD1 LEU B 89 -10.556 0.436 -7.156 1.00 1.14 + B C ATOM 526 CD2 LEU B 89 -9.761 0.789 -4.816 1.00 1.06 + B C ATOM 527 N LEU B 90 -7.122 0.789 -8.172 1.00 0.40 + B N ATOM 528 CA LEU B 90 -7.009 1.991 -9.048 1.00 0.47 + B C ATOM 529 C LEU B 90 -6.666 1.556 -10.473 1.00 0.51 + B C ATOM 530 O LEU B 90 -7.269 2.002 -11.430 1.00 0.60 + B O ATOM 531 CB LEU B 90 -5.906 2.911 -8.516 1.00 0.47 + B C ATOM 532 CG LEU B 90 -5.953 4.248 -9.259 1.00 1.14 + B C ATOM 533 CD1 LEU B 90 -7.336 4.880 -9.094 1.00 2.02 + B C ATOM 534 CD2 LEU B 90 -4.897 5.191 -8.680 1.00 1.67 + B C ATOM 535 N ILE B 91 -5.705 0.686 -10.630 1.00 0.47 + B N ATOM 536 CA ILE B 91 -5.337 0.226 -11.998 1.00 0.57 + B C ATOM 537 C ILE B 91 -6.564 -0.365 -12.684 1.00 0.65 + B C ATOM 538 O ILE B 91 -6.864 -0.046 -13.815 1.00 0.77 + B O ATOM 539 CB ILE B 91 -4.238 -0.832 -11.905 1.00 0.55 + B C ATOM 540 CG1 ILE B 91 -2.979 -0.209 -11.300 1.00 0.51 + B C ATOM 541 CG2 ILE B 91 -3.921 -1.366 -13.302 1.00 0.66 + B C ATOM 542 CD1 ILE B 91 -2.013 -1.317 -10.874 1.00 0.54 + B C ATOM 543 N SER B 92 -7.285 -1.221 -12.011 1.00 0.63 + B N ATOM 544 CA SER B 92 -8.491 -1.807 -12.649 1.00 0.76 + B C ATOM 545 C SER B 92 -9.415 -0.668 -13.068 1.00 0.83 + B C ATOM 546 O SER B 92 -9.898 -0.620 -14.182 1.00 0.95 + B O ATOM 547 CB SER B 92 -9.216 -2.714 -11.651 1.00 0.74 + B C ATOM 548 OG SER B 92 -8.356 -2.977 -10.548 1.00 1.53 + B O ATOM 549 N TYR B 93 -9.649 0.266 -12.187 1.00 0.77 + B N ATOM 550 CA TYR B 93 -10.522 1.415 
-12.538 1.00 0.87 + B C ATOM 551 C TYR B 93 -9.949 2.130 -13.766 1.00 0.94 + B C ATOM 552 O TYR B 93 -10.659 2.424 -14.708 1.00 1.07 + B O ATOM 553 CB TYR B 93 -10.581 2.390 -11.359 1.00 0.83 + B C ATOM 554 CG TYR B 93 -11.419 3.587 -11.739 1.00 1.09 + B C ATOM 555 CD1 TYR B 93 -12.800 3.446 -11.923 1.00 1.68 + B C ATOM 556 CD2 TYR B 93 -10.813 4.838 -11.903 1.00 1.74 + B C ATOM 557 CE1 TYR B 93 -13.574 4.559 -12.271 1.00 2.03 + B C ATOM 558 CE2 TYR B 93 -11.588 5.950 -12.253 1.00 1.99 + B C ATOM 559 CZ TYR B 93 -12.969 5.811 -12.436 1.00 1.84 + B C ATOM 560 OH TYR B 93 -13.735 6.907 -12.781 1.00 2.25 + B O ATOM 561 N GLY B 94 -8.671 2.404 -13.767 1.00 0.88 + B N ATOM 562 CA GLY B 94 -8.058 3.094 -14.938 1.00 0.98 + B C ATOM 563 C GLY B 94 -8.314 2.268 -16.198 1.00 1.10 + B C ATOM 564 O GLY B 94 -8.734 2.784 -17.215 1.00 1.24 + B O ATOM 565 N ILE B 95 -8.074 0.988 -16.135 1.00 1.07 + B N ATOM 566 CA ILE B 95 -8.317 0.133 -17.332 1.00 1.22 + B C ATOM 567 C ILE B 95 -9.781 0.282 -17.758 1.00 1.34 + B C ATOM 568 O ILE B 95 -10.082 0.472 -18.918 1.00 1.50 + B O ATOM 569 CB ILE B 95 -8.008 -1.334 -16.994 1.00 1.20 + B C ATOM 570 CG1 ILE B 95 -6.498 -1.567 -17.098 1.00 1.37 + B C ATOM 571 CG2 ILE B 95 -8.725 -2.266 -17.976 1.00 1.50 + B C ATOM 572 CD1 ILE B 95 -5.865 -1.466 -15.711 1.00 1.81 + B C ATOM 573 N ARG B 96 -10.690 0.197 -16.825 1.00 1.29 + B N ATOM 574 CA ARG B 96 -12.136 0.338 -17.174 1.00 1.42 + B C ATOM 575 C ARG B 96 -12.369 1.681 -17.865 1.00 1.52 + B C ATOM 576 O ARG B 96 -13.026 1.753 -18.885 1.00 1.68 + B O ATOM 577 CB ARG B 96 -12.990 0.259 -15.904 1.00 1.36 + B C ATOM 578 CG ARG B 96 -14.464 0.431 -16.269 1.00 1.61 + B C ATOM 579 CD ARG B 96 -14.889 -0.683 -17.229 1.00 1.77 + B C ATOM 580 NE ARG B 96 -16.316 -0.490 -17.614 1.00 1.70 + B N ATOM 581 CZ ARG B 96 -16.886 -1.329 -18.436 1.00 2.10 + B C ATOM 582 NH1 ARG B 96 -16.274 -2.432 -18.774 1.00 2.69 + B N ATOM 583 NH2 ARG B 96 -18.070 -1.069 -18.916 1.00 2.54 + B N ATOM 584 N ARG B 97 -11.840 2.749 
-17.329 1.00 1.43 + B N ATOM 585 CA ARG B 97 -12.046 4.068 -17.986 1.00 1.55 + B C ATOM 586 C ARG B 97 -11.542 3.958 -19.420 1.00 1.67 + B C ATOM 587 O ARG B 97 -12.217 4.327 -20.360 1.00 1.83 + B O ATOM 588 CB ARG B 97 -11.257 5.147 -17.238 1.00 1.45 + B C ATOM 589 CG ARG B 97 -11.506 6.507 -17.891 1.00 1.62 + B C ATOM 590 CD ARG B 97 -12.997 6.846 -17.817 1.00 1.72 + B C ATOM 591 NE ARG B 97 -13.258 8.124 -18.536 1.00 2.17 + B N ATOM 592 CZ ARG B 97 -14.471 8.607 -18.599 1.00 2.50 + B C ATOM 593 NH1 ARG B 97 -14.697 9.727 -19.230 1.00 3.17 + B N ATOM 594 NH2 ARG B 97 -15.457 7.968 -18.031 1.00 2.89 + B N ATOM 595 N LEU B 98 -10.372 3.410 -19.596 1.00 1.62 + B N ATOM 596 CA LEU B 98 -9.838 3.224 -20.968 1.00 1.76 + B C ATOM 597 C LEU B 98 -10.847 2.387 -21.754 1.00 1.92 + B C ATOM 598 O LEU B 98 -11.183 2.698 -22.876 1.00 2.09 + B O ATOM 599 CB LEU B 98 -8.499 2.488 -20.904 1.00 1.69 + B C ATOM 600 CG LEU B 98 -7.456 3.378 -20.230 1.00 1.72 + B C ATOM 601 CD1 LEU B 98 -6.300 2.514 -19.721 1.00 1.72 + B C ATOM 602 CD2 LEU B 98 -6.927 4.396 -21.241 1.00 2.01 + B C ATOM 603 N ILE B 99 -11.311 1.322 -21.144 1.00 1.89 + B N ATOM 604 CA ILE B 99 -12.301 0.398 -21.779 1.00 2.07 + B C ATOM 605 C ILE B 99 -11.559 -0.755 -22.449 1.00 2.17 + B C ATOM 606 O ILE B 99 -10.995 -0.619 -23.516 1.00 2.50 + B O ATOM 607 CB ILE B 99 -13.168 1.136 -22.798 1.00 2.14 + B C ATOM 608 CG1 ILE B 99 -13.786 2.363 -22.124 1.00 2.13 + B C ATOM 609 CG2 ILE B 99 -14.281 0.207 -23.281 1.00 2.25 + B C ATOM 610 CD1 ILE B 99 -13.583 3.596 -23.005 1.00 2.38 + B C ATOM 611 N LYS B 100 -11.549 -1.894 -21.809 1.00 2.25 + B N ATOM 612 CA LYS B 100 -10.840 -3.073 -22.378 1.00 2.60 + B C ATOM 613 C LYS B 100 -11.762 -3.804 -23.357 1.00 2.97 + B C ATOM 614 O LYS B 100 -12.864 -4.185 -23.020 1.00 3.27 + B O ATOM 615 CB LYS B 100 -10.447 -4.019 -21.240 1.00 2.80 + B C ATOM 616 CG LYS B 100 -9.148 -4.742 -21.599 1.00 3.38 + B C ATOM 617 CD LYS B 100 -8.868 -5.835 -20.564 1.00 3.94 + B C ATOM 618 CE LYS B 100 
-8.754 -7.188 -21.269 1.00 4.56 + B C ATOM 619 NZ LYS B 100 -8.755 -8.278 -20.251 1.00 5.22 + B N ATOM 620 N LYS B 101 -11.317 -3.996 -24.571 1.00 3.27 + B N ATOM 621 CA LYS B 101 -12.165 -4.699 -25.578 1.00 3.83 + B C ATOM 622 C LYS B 101 -13.408 -3.856 -25.871 1.00 4.00 + B C ATOM 623 O LYS B 101 -14.436 -4.441 -26.180 1.00 4.36 + B O ATOM 624 CB LYS B 101 -12.597 -6.069 -25.037 1.00 4.43 + B C ATOM 625 CG LYS B 101 -11.369 -6.952 -24.794 1.00 5.03 + B C ATOM 626 CD LYS B 101 -10.711 -7.301 -26.132 1.00 5.77 + B C ATOM 627 CE LYS B 101 -9.211 -7.515 -25.929 1.00 6.68 + B C ATOM 628 NZ LYS B 101 -8.981 -8.833 -25.272 1.00 7.38 + B N ATOM 629 OXT LYS B 101 -13.313 -2.643 -25.782 1.00 4.28 + B O


Let me tell you now the important info for my case, taking the last line as example: The important characters in those lines for me are the A/B character (the chains), the number next to them (66) and the 3rd number after that (16.840).
I can get this info through pattern matching, with the following code:
while ($_ =~ /ATOM\s+\d+\s+.*?\s+.*?\s+(\w{1})\s+(\d+)\s+[\d\.\-]+\s+[\d\.\-]+\s+([\d\.\-]+)\s+.*/mg) {
    $chain = $1;
    $position = $2;
    $Zcoordinate = $3;
}

BUT, what I would need is somehow to be able to keep the info for each chain separately (in this example A and B, but it can be more letters) and also, for each chain only the $Zcoordinate for the smallest position (in this example 66). I was thinking that the idea would be to somehow create an array of $Zcoordinates for each chain and then get the smallest of the positions, but I don't know how to do that (since I can't know beforehand which are the chain numbers...)

Re: How to select specific lines from a file
by davido (Cardinal) on Apr 29, 2014 at 17:29 UTC

    I love it. "Taking the last line..." (which is line 570 of the sample input): 66 occurs in the 6th column of the first row and the 293rd row, 16.840 occurs in the 9th column of line 293 from the sample input, and 'B' occurs on the last row and the 293rd, but not on the first. But a little searching through the sample input resolved any ambiguity.

    Anyway, this is fixed-width stuff, so you should probably be thinking in terms of unpack or substr rather than regular expressions:

    use strict;
    use warnings;

    while ( <DATA> ) {
        next unless /^ATOM\b/;
        my $chain       = substr $_, 21, 1;
        my $position    = 0 + substr $_, 23, 3;
        my $Zcoordinate = 0 + substr $_, 47, 7;
        print "$chain, $position, $Zcoordinate\n";
    }

    __DATA__
    ATOM     30  N   HIS A  66       7.514  15.296  11.222  1.00 12.98      A    N
    ATOM     31  CA  HIS A  66       7.318  14.688  12.568  1.00 12.48      A    C
    ATOM     32  C   HIS A  66       8.676  14.309  13.156  1.00 11.62      A    C
    ATOM     33  O   HIS A  66       9.708  14.518  12.545  1.00 11.76      A    O

    Update:

    Using unpack is a more computationally efficient alternative, though from a programmer standpoint it always takes me longer to work out the template, which is why I posted the substr solution first. Now that I've had time to work out the template for the unpack solution, here it is:

    while ( <DATA> ) {
        next unless /^ATOM\b/;
        my( $chain, $position, $Zcoordinate ) = unpack( 'x21a1xA3x21A7', $_ );
        print "$chain, $position, $Zcoordinate\n";
    }

    __DATA__
    ATOM     30  N   HIS A  66       7.514  15.296  11.222  1.00 12.98      A    N
    ATOM     31  CA  HIS A  66       7.318  14.688  12.568  1.00 12.48      A    C
    ATOM     32  C   HIS A  66       8.676  14.309  13.156  1.00 11.62      A    C
    ATOM     33  O   HIS A  66       9.708  14.518  12.545  1.00 11.76      A    O
    ATOM     34  CB  HIS A  66       6.450  13.434  12.442  1.00 12.81      A    C
    ATOM     35  CG  HIS A  66       5.000  13.829  12.378  1.00 13.36      A    C
    ATOM     36  ND1 HIS A  66       4.332  14.002  11.175  1.00 13.57      A    N
    ATOM     37  CD2 HIS A  66       4.073  14.085  13.360  1.00 13.93      A    C
    ATOM     38  CE1 HIS A  66       3.063  14.347  11.461  1.00 14.23      A    C
    ATOM     39  NE2 HIS A  66       2.851  14.410  12.778  1.00 14.47      A    N

    Whether you use substr, or unpack, you'll then be able to feed the input into a data structure as described by Choroba.


    Dave
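For readers who want to see the two suggestions combined, here is a minimal sketch that feeds the unpack template from the post above into a hash of arrays. The helper names atom_line and lowest_per_chain are hypothetical, and the sample records are built with sprintf on the assumption that the real file follows the standard PDB column layout.

```perl
use strict;
use warnings;

# Hypothetical helper for the demo only: formats one ATOM record using the
# standard PDB column layout (an assumption about the OP's real file).
sub atom_line {
    my ($serial, $name, $res, $chain, $pos, $x, $y, $z) = @_;
    return sprintf "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f",
        $serial, $name, $res, $chain, $pos, $x, $y, $z, 1.00, 0.00;
}

# Returns { chain => [position, Zcoordinate] } holding, for each chain,
# the first ATOM record seen at the smallest position.
sub lowest_per_chain {
    my @lines = @_;
    my %lowest;
    for my $line (@lines) {
        next unless $line =~ /^ATOM\b/ and length($line) >= 54;
        my ($chain, $pos, $z) = unpack 'x21a1xA3x21A7', $line;
        ($pos, $z) = (0 + $pos, 0 + $z);    # numify, dropping the padding
        $lowest{$chain} = [$pos, $z]
            if not exists $lowest{$chain} or $pos < $lowest{$chain}[0];
    }
    return \%lowest;
}

my @sample = (
    atom_line( 40, ' N',  'HIS', 'A', 67,  8.695, 13.753,  14.336),
    atom_line( 30, ' N',  'HIS', 'A', 66,  7.514, 15.296,  11.222),
    atom_line( 31, ' CA', 'HIS', 'A', 66,  7.318, 14.688,  12.568),
    atom_line(551, ' C',  'TYR', 'B', 93, -9.949,  2.130, -13.766),
);
my $lowest = lowest_per_chain(@sample);
print "$_: position $lowest->{$_}[0], Z $lowest->{$_}[1]\n"
    for sort keys %$lowest;
```

With these four sample records the hash should end up with one entry per chain: position 66 for chain A and position 93 for chain B. The chains do not need to be known in advance, because each key is created the first time its chain letter is seen.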

      I love it. "Taking the last line..." (which is line 570 of the sample input): 66 occurs in the 6th column of the first row and the 293rd row, 16.840 occurs in the 9th column of line 293 from the sample input, and 'B' occurs on the last row and the 293rd, but not on the first.

      You love it? So do I. I also spent quite a bit of time trying to figure out which line the OP was really talking about.

      Anyway, this is fixed-width stuff, so you should probably be thinking in terms of unpack or substr rather than regular expressions.

      I definitely agree that unpack and substr are the most efficient solutions in terms of computing resources (especially unpack, most probably). But, picking up on your remark about the programmer's standpoint, and assuming that the file is just a few hundred or a few thousand lines, I might well consider a regular expression instead: not a regex like the one the OP posted, but a very simple one in a call to the split function. Sometimes, with data similar to the OP's, I find it easier to use something like:

      my ($key, $value, $predicate) = (split /\s+/, $line)[0,3,7];
      rather than having to compute the exact position of each piece of data (and testing to make sure that I don't have an off-by-one error). But I do that only insofar as I am reading a relatively small parameter or reference data file before processing very large or sometimes huge data sets.

      (Typically, my reference data files have a few hundred or a few thousand lines, while the real data files to be analyzed have at least tens of millions of lines, sometimes hundreds of millions of lines. In such cases, I really don't mind spending a split second more reading the reference data if I know that processing the main data will take 20 minutes anyway. In other words, I would most probably use the substr or unpack function for the main data, if appropriate, but I don't mind using a slightly slower process for small reference data if it saves me some development time and makes the code easier to understand at first glance when I have to maintain it.)

      But this was just a side note about slightly specific situations, I agree otherwise fully with just about everything that you said.

        I like your comment, and intend to upvote it. In this specific case it appears that the data set is tame enough that the distinction between fixed-width and space-delimited is moot.

        However, one general principle that I try to adhere to as much as possible is placing as few demands on a data set as possible. This concept can be generalized from some lessons I learned by reading Effective STL, where Scott Meyers makes some strong cases for why a template container class should place as few requirements as possible on the objects it contains. I'd love to go into the details, but it's a big enough concept that I probably wouldn't do it justice in a simple PerlMonks node.

        Let's take it as a given, then, that the generalized practice of placing as few demands on an entity that we don't control as possible is "a good thing". In particular, doing so helps to simplify our parser, allows us to unambiguously reject data that is broken, and probably even makes it easier to generate valid data.

        So what is the simplest, least demanding set of requirements that we can place on our OP's data? As we look it over, it becomes pretty obvious that it is fixed width, and that it is space delimited. ...or is it? What if one of those numeric fields (66, for example) extends to four digits? We already see places in his data set where it extends to three digits. A fourth would cause it to run up against our "[AB]" field. So there's one requirement we have to place on the data set: no column can become filled to the point that it touches the one next to it. 1000 is illegal for the 6th field. Maybe this is reasonable, but I don't know. I do know that as 66 grows to 100, the field widths haven't shifted, so that field size must always be four or less. But I don't know if four digits is a possible in-range value.

        What about blank fields? The user's data set example has no blank fields (that I can detect, though there are some big gaps). \s+ delimited data requires that every field contain something. There's another demand placed on our data set, or if not placed on the data set, another ambiguity that our parser must deal with.

        Next, by looking at his data it seems obvious that there cannot be embedded spaces. However, that is not just an observation, it's a requirement placed on the data. If a field ever changes such that it allows embedded spaces, our parser breaks. And if that ever happens, we run into all sorts of additional demands for our data; embedded spaces must be escaped or quoted, quotes must be balanced if used, embedded quotes must be escaped, and so on.

        This will probably never happen with the user's data set; it may never morph into something more complex. Splitting on space may forever be fine. ...it will have to be fine because the parser now demands it. It can never be permitted to morph into something that includes, for example, a notes field (unless it's in the last position, which is another requirement placed on the data and another rule for the parser), completely full fields, or blank fields.

        So here are the choices for how we can parse fixed-width data:

        1. As fixed width: Must be fixed width.
        2. As space delimited: No full fields, no blank fields, no embedded spaces.

        The first rule seems to be the most likely for this data set. If we treat it as fixed width, we impose only one requirement, and that requirement is probably already part of the implementation of the producer. If we treat fixed-width data as space delimited, we impose three additional restrictions on the data. Treat fixed width as fixed width for the most robust solution.
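To make the "full field" case concrete, here is a small sketch comparing the two parsing styles on a record whose position reaches four digits. It rests on assumptions: the sample lines are generated with sprintf in the standard PDB column layout, the helper names (atom_line, by_split, by_unpack) are hypothetical, and the unpack template is a slightly wider variant (A4 for the position, A8 for Z) of the one posted earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper for the demo only: formats one ATOM record using the
# standard PDB column layout (an assumption about the OP's real file).
sub atom_line {
    my ($serial, $name, $res, $chain, $pos, $x, $y, $z) = @_;
    return sprintf "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f",
        $serial, $name, $res, $chain, $pos, $x, $y, $z, 1.00, 0.00;
}

# Space-delimited view: fields 4, 5 and 8 after splitting on whitespace.
sub by_split {
    my @fields = split ' ', $_[0];
    return @fields[4, 5, 8];
}

# Fixed-width view: chain in column 22, position in columns 23-26,
# Z coordinate in columns 47-54 (1-based, standard PDB layout assumed).
sub by_unpack {
    my ($chain, $pos, $z) = unpack 'x21a1A4x20A8', $_[0];
    return ($chain, 0 + $pos, 0 + $z);
}

for my $pos (66, 1000) {
    my $line = atom_line(30, ' N', 'HIS', 'A', $pos, 7.514, 15.296, 11.222);
    print "position $pos\n";
    print "  split:  ", join(' | ', by_split($line)),  "\n";
    print "  unpack: ", join(' | ', by_unpack($line)), "\n";
}
```

For position 66 both views agree. For position 1000 the chain and position columns touch, so split sees a single field "A1000" and every later index shifts by one, while the fixed-width view still returns the chain, the position and the Z coordinate separately.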


        Dave

Re: How to select specific lines from a file
by choroba (Cardinal) on Apr 29, 2014 at 17:26 UTC
    I'd use a hash of arrays (see perldsc - Perl Data Structures Cookbook). The chain will be the key, there will be two values in each array: the Z-coordinate and the position. Then just iterate the input and test whether the new line should go to the hash or not:
    my %hash;
    while (<>) {
        my ($chain, $pos, $zcoor) = (split)[4, 5, 8];
        $hash{$chain} = [$pos, $zcoor]
            if not exists $hash{$chain} or $pos < $hash{$chain}[0];
    }
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      Great, it seems to do the trick... Is it correct that I get all positions in the HoA that you create?
      while (<>) {
          %HoA = ();
          next unless /^ATOM\b/;
          @split_atom_line = split(/\s+/, $_);
          $chain = $split_atom_line[4];
          $position = $split_atom_line[5];
          $Zcoordinate = $split_atom_line[8];
          if (not exists $HoA{$chain} or $position < $HoA{$chain}[0]) {
              $HoA{$chain} = [$position, $Zcoordinate];
          }
          for $key ( keys %HoA ) {
              print "$key: $HoA{$key}[0]\n";
          }
      }
        I mean, it keeps adding data although the position and the chain have already been inserted... Shouldn't the HoA contain only two lines in this example, i.e. chains A and B and 1 position for each (the smallest)?
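One possible reading of the behaviour described here, offered as a sketch rather than a confirmed diagnosis: in the loop as posted, the hash is emptied and printed on every pass, so each input line produces its own output. Declaring the hash once before the loop and printing only after the loop should leave one entry per chain. The wrapper name smallest_positions and the three whitespace-separated sample lines are assumptions made for the demo.

```perl
use strict;
use warnings;

# Hypothetical wrapper so the loop can be run on a list of lines.
sub smallest_positions {
    my @lines = @_;
    my %HoA;                       # declared once, before the loop
    for my $line (@lines) {
        next unless $line =~ /^ATOM\b/;
        my ($chain, $position, $Zcoordinate) = (split ' ', $line)[4, 5, 8];
        if (not exists $HoA{$chain} or $position < $HoA{$chain}[0]) {
            $HoA{$chain} = [$position, $Zcoordinate];
        }
    }
    return %HoA;
}

my %HoA = smallest_positions(
    'ATOM 30 N HIS A 66 7.514 15.296 11.222 1.00 12.98 A N',
    'ATOM 40 N HIS A 67 8.695 13.753 14.336 1.00 10.90 A N',
    'ATOM 551 C TYR B 93 -9.949 2.130 -13.766 1.00 0.94 B C',
);

# Report only after every line has been read.
for my $key (sort keys %HoA) {
    print "$key: $HoA{$key}[0] $HoA{$key}[1]\n";
}
```

With the three sample lines, this should print one line for chain A (position 66) and one for chain B (position 93).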
Re: How to select specific lines from a file
by pvaldes (Chaplain) on Apr 30, 2014 at 00:13 UTC

    If you need to parse, write, change and manipulate PDB files, you should probably make the effort and consider BioPerl. This code is untested and positively wrong (the docs of Bio::Structure are starting to bite me like a goliath tiger fish), but you can start playing with this:

    use strict;
    use warnings;
    use Bio::Structure::IO;
    #use Data::Dumper;

    my $in = Bio::Structure::IO->new(-file => "myfile.pdb", -format => 'PDB');
    while ( my $struc = $in->next_structure() ) {
        print "Structure: ", $struc->id;
        for my $model ($struc->get_models) {
            print "model: ", $model->id;
            for my $chain ($struc->get_chains) {
                if ($chain->id eq "A") {
                    print "we have an A!";
                    foreach my $res ($struc->get_residues($chain)) {
                        print "Yeah, Honestly, I don't know what I'm doing here";
                        foreach my $atom ($struc->get_atoms($res)) {
                            print "This is the ATOM ", $atom->id;
                        }
                    }
                }
                last;
            }
            last;
        }
    }

    mmmh, or maybe use Chemistry::File::PDB?

Node Type: perlquestion [id://1084358]
Approved by mr_mischief