PerlMonks
Re^2: How to select specific lines from a file
by Laurent_R (Canon)
on Apr 29, 2014 at 21:05 UTC ( [id://1084392]=note )
I love it. "Taking the last line..." (which is line 570 of the sample input): 66 occurs in the 6th column of the first row and the 293rd row, 16.840 occurs in the 9th column of line 293 from the sample input, and 'B' occurs on the last row and the 293rd, but not on the first.

You love it? So do I. I also spent quite a bit of time trying to figure out which line the OP was really talking about.

Anyway, this is fixed-width stuff, so you should probably be thinking in terms of unpack or substr rather than regular expressions.

I definitely agree that unpack or substr are the most efficient solutions in terms of computing resources (unpack especially, most probably). But, picking up on your remark about the programmer's standpoint, and assuming that the file is just a few hundred or a few thousand lines, I might just as well consider a regular expression: not a regex similar to what the OP posted, but a very simple one in a call to the split function. Sometimes, with data looking similar to the OP's data, I find it easier to use something like that rather than having to compute the exact position of each piece of data (and testing to make sure that I don't have an off-by-one error).

But I do that only insofar as I am reading a relatively small parameter or reference data file before processing very large or sometimes huge data sets. Typically, my reference data files have a few hundred or a few thousand lines, while the real data files to be analyzed have at least tens of millions of lines, sometimes hundreds of millions of lines. In such cases, I really don't mind spending a split second more reading the reference data if I know that processing the main data will take 20 minutes anyway. In other words, I would most probably use substr or unpack for the main data, if appropriate, but I don't mind using a slightly slower process for small reference data if it saves me some development time and makes the code easier to understand at first glance when I have to maintain it.

But this was just a side note about slightly specific situations; otherwise I fully agree with just about everything that you said.
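
To make the split versus unpack trade-off concrete, here is a minimal sketch. The record layout, the field widths, the sample values and the helper names parse_split and parse_unpack are all made up for illustration; they are not the OP's real format or code.

```perl
use strict;
use warnings;

# Hypothetical fixed-width layout (NOT the OP's real format):
# name in 6 columns, count in 3, value in 7, flag in 1,
# with a single space between fields.
my $template = 'A6 x A3 x A7 x A1';

# Whitespace split: no column positions to work out.
sub parse_split {
    my ($line) = @_;
    return split ' ', $line;
}

# unpack: depends on the exact widths in $template.
# 'A' strips trailing spaces from a field, 'x' skips one byte.
sub parse_unpack {
    my ($line) = @_;
    return unpack $template, $line;
}

# Build a sample record that matches the assumed layout.
my $line = sprintf '%-6s %-3s %-7s %s', 'ATOM', 66, '16.840', 'B';

print join('|', parse_split($line)),  "\n";   # prints ATOM|66|16.840|B
print join('|', parse_unpack($line)), "\n";   # prints ATOM|66|16.840|B
```

Both calls return the same four fields here. The split version only works as long as every field is non-empty and contains no embedded spaces; if a column can be blank, the fields shift and unpack or substr becomes the safer choice, on top of being the faster one for very large files.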