Best way to read line x from a file

Melly has asked for the wisdom of the Perl Monks concerning the following question:

Re: Best way to read line x from a file by Corion (Patriarch) on Mar 29, 2004 at 15:34 UTC
I think that Tie::File is the best compromise between speed and simplicity for such tasks. The fastest way would be a loop like the following, assuming that the line indicated by `$line_no` will start in the first half of the file: `<FILE> while ($line_no--); my $line2 = <FILE>;` That way, Perl and the OS will do some buffering for you, and you don't read the whole file for nothing. If your file size is smaller than one sector, the OS (and the HD) will read it into memory anyway, and it might be faster to slurp it into memory and use a crafted regular expression against it: `use File::Slurp qw(slurp); my $f = slurp $filename; my $line2 = $1 if (m!\n{$line_no-1}([^\n]*)\n!sm);` So in the end, you will have to benchmark a lot.
Re2: Best way to read line x from a file by Hofmator (Curate) on Mar 30, 2004 at 12:17 UTC
A couple of small things went wrong in your 2nd example. `slurp` is spelled `read_file`, at least in the newest CPAN version 9999.04. The regex is matching against `$_`, not `$f`. The regex is not working in multiple ways :). `{$line_no-1}` evaluates to e.g. `{10-1}`, which looks for the literal string `'{10-1}'`. Even if this worked, it would be looking for consecutive newlines, so consecutive empty lines. And there are more mistakes ... The correct version should be something like this: `use File::Slurp; my $f = read_file $filename; my $line2 = $1 if ($f =~ m!\A(?:.*\n){@{[$line_no-1]}}(.*)\n!m);` ... but I wouldn't recommend it. And for the sake of completeness, here is the solution spelled out with Tie::File, which lots of people have mentioned already. `use Tie::File; tie my @file, 'Tie::File', $filename or die "Couldn't tie '$filename': $!"; my $line2 = $file[9];` -- Hofmator
Re: Best way to read line x from a file by arden (Curate) on Mar 29, 2004 at 15:31 UTC
Unless your lines are all the same length, I think the best way is probably that which you've chosen. If, however, your lines are all the same length, you could use `seek()`. Here is how I read in a specific line from a file if I am only interested in the one bit of a file. `$. = 0; do { $LINE = <FILE> } until $. == $DESIRED_LINE_NUMBER || eof;` Now, if you're going to potentially bounce around within the file (say, look at line 5000, then line 20, then line 42, etc), there are other strategies, but since I don't think that's what you're looking for, we won't go there right yet. . . - - arden. arden is more of an orangutan than a monkee
Re: Re: Best way to read line x from a file by Melly (Chaplain) on Mar 29, 2004 at 15:35 UTC
Is it wasteful? - i.e. does such notation force perl to read in the whole file, or does it just read in the lines up to line x, or does it (we can but hope) somehow just read in line x? Tom Melly, tom@tomandlu.co.uk
Re: Re: Re: Best way to read line x from a file by arden (Curate) on Mar 29, 2004 at 15:55 UTC
No, `seek()` is not wasteful, however it doesn't really understand the concept of a line either. Seek basically blitzes its way to the location requested, so any future reads start from that location. You can also use seek to go backwards in a file. But again, it doesn't work on the principle of "lines", instead it works on "byte offsets". That's why in your case it would only work if every line is of the same length. - - arden. arden is more of an orangutan than a monkee
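To make the byte-offset point concrete, here is a minimal sketch of the fixed-length case. It assumes every line, including its trailing newline, is exactly `$reclen` bytes long; the helper name `read_fixed_line` and its parameters are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch line $n (1-based) from a file in which every
# line, trailing newline included, is exactly $reclen bytes long.
sub read_fixed_line {
    my ($path, $n, $reclen) = @_;
    open my $fh, '<:raw', $path or die "Can't open '$path': $!";

    # Jump straight to the byte offset where line $n starts.
    seek $fh, ($n - 1) * $reclen, 0 or die "Can't seek in '$path': $!";

    my $buf;
    my $got = read $fh, $buf, $reclen;
    close $fh;

    # A short or empty read means line $n does not exist.
    return unless defined $got && $got == $reclen;

    $buf =~ s/\n\z//;    # drop the trailing newline
    return $buf;
}
```

Nothing before the requested line is read, which is why this only works when the offset of line `$n` can be computed rather than counted.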
Re: Best way to read line x from a file by davido (Cardinal) on Mar 29, 2004 at 16:09 UTC
`my $line2 = (<FILE>)[9];` Your method evaluates `<FILE>` in list context, resulting in a file slurp. Then you index into only one line, and let the rest of the slurp fall into the bit-bucket. I agree with Corion that Tie::File is a great solution. But I couldn't leave well enough alone, and had to come up with yet another way to do it. This solution still reads through the file up until it gets to the desired line. There's no way around that unless your lines are fixed-length: `my $linenum = 10; while ( my $line = <FILE>) { next unless $. == $linenum; # Process the one line here... last; # No need to continue. }` I hadn't seen anyone using `$.` yet. See perlvar. Update: Added `last;` to the loop. Thanks for the reminder. Dave
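For anyone who wants the `$.` loop laid out over several lines, here is a sketch wrapped in a sub. The name `nth_line` and the lexical filehandle are my own choices, not something from the posts above; the logic is the same read-until-`$.`-matches idea.

```perl
use strict;
use warnings;

# Hypothetical helper: return line $want (1-based) of $path without its
# newline, or undef if the file has fewer lines than that.
sub nth_line {
    my ($path, $want) = @_;
    open my $fh, '<', $path or die "Can't open '$path': $!";

    my $found;
    while (my $line = <$fh>) {
        next unless $. == $want;    # $. is the line number just read from $fh
        chomp($found = $line);
        last;                       # stop here, the rest of the file is not read
    }
    close $fh;
    return $found;
}
```

Called as `nth_line($filename, 10)`, it reads ten lines and stops, rather than slurping the file and throwing most of it away.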
Re: Re: Best way to read line x from a file by TomDLux (Vicar) on Mar 29, 2004 at 16:50 UTC
Don't forget to make sure that "Process the one line..." includes the command `last`, to exit the loop, otherwise you simply have an expanded version of slurp. -- `TTTATCGGTCGTTATATAGATGTTTGCA`
Re: Best way to read line x from a file by ctilmes (Vicar) on Mar 29, 2004 at 16:05 UTC
You might also consider using Mmap. You can treat the file as a variable, and only the portions of it that you actually access will get read from disk, and then in a very efficient manner.
Re: Best way to read line x from a file by ambrus (Abbot) on Mar 29, 2004 at 18:32 UTC
As others have said, this is wasteful because it reads the whole file while it should read only the first 9 lines. If you want a solution that has no visible loop (or map etc), you could try using the module `Tie::File`. This module is in the standard Perl distribution. (Note that Tie::File numbers the lines with zero-offset.) Otherwise, for me `$l= <$F> for 1..9;` seems the best solution but there might be a more elegant one.
Re: Best way to read line x from a file by gmpassos (Priest) on Mar 29, 2004 at 16:54 UTC
Well, you really need to read line by line to be sure that you are at line X, unless your lines have a fixed size. Another thing you can do, to avoid always reading the whole file, is to save something like an index of the byte positions of some lines in an extra file. So, for a big file you can have some indexed lines, and when you want to go to line X, you pick the nearest indexed line before it and start searching for line X from there. Note that the lookup of the nearest line in the index needs to be very fast, and the index small, or you won't gain much. Graciliano M. P. "Creativity is the expression of the liberty".
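A rough sketch of that index idea, for illustration only. It records the starting byte offset of every `$step`-th line and keeps the index in memory; writing it to an extra file, as suggested above, is left out. The names `build_index` and `indexed_line` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: record the byte offset at which every $step-th line
# starts (lines 1, 1+$step, 1+2*$step, ...). Returns an array reference.
sub build_index {
    my ($path, $step) = @_;
    open my $fh, '<', $path or die "Can't open '$path': $!";
    my @offsets;
    my $pos = 0;                       # offset where the next line starts
    while (defined(my $line = <$fh>)) {
        push @offsets, $pos if ($. - 1) % $step == 0;
        $pos = tell $fh;
    }
    close $fh;
    return \@offsets;
}

# Hypothetical helper: fetch line $want (1-based) by seeking to the nearest
# indexed line at or before it, then reading forward from there.
sub indexed_line {
    my ($path, $index, $step, $want) = @_;
    return if $want < 1;
    my $slot = int(($want - 1) / $step);
    return if $slot > $#$index;        # beyond the last indexed line

    open my $fh, '<', $path or die "Can't open '$path': $!";
    seek $fh, $index->[$slot], 0 or die "Can't seek in '$path': $!";

    my $skip = ($want - 1) - $slot * $step;   # lines between index entry and target
    my $line;
    for (0 .. $skip) {
        $line = <$fh>;
        last unless defined $line;     # file ended before line $want
    }
    close $fh;
    chomp $line if defined $line;
    return $line;
}
```

With `$step` set to 1000, finding any line costs one array lookup, one seek and at most 1000 line reads. A smaller step means fewer reads per lookup but a bigger index, and the index must be rebuilt whenever the file changes.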
Re: Best way to read line x from a file by flyingmoose (Priest) on Mar 29, 2004 at 19:13 UTC
Hi Monkees, Hey hey, we're the Monkees, people say we monkey around, but we're too busy coding, to put the Camel down... Somebody else, next verse...
Re: Re: Best way to read line x from a file by Popcorn Dave (Abbot) on Mar 29, 2004 at 20:53 UTC
We're just trying to be friendly, we only want to code all day. And if you don't use strict, we're gonna have something to say. And the real lyrics. :) There is no emoticon for what I'm feeling now.

