PerlMonks  

Re: efficient way to read a file in reverse

by tybalt89 (Prior)
on Jan 07, 2021 at 15:13 UTC (#11126536)


in reply to efficient way to read a file in reverse

XY problem?

Looking at your code, it appears you are reading kernel logs for temperature problems, but only entries after a certain Unix time.
It may be possible to use Search::Dict to find the first line in the file on or after a specified time and then read forward to the end.
No backward reading would be required.

It would look something like this:

#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=11126426
use warnings;
use Search::Dict;
use Date::Parse qw( str2time );
use Time::HiRes qw( time );

my $day = 60 * 60 * 24;

# commented code used to create 209M file;
#my $str = join '', map { localtime( time + $_ * $day) .
#  " kernel: log entry\n" } -5e6 .. 5;
#print $str;
#use Path::Tiny; path('d.searchdict')->spew($str);
#print "string length = @{[length $str]}\n";

my $want = time - 1.1 * $day;

my $start = time;
open my $fh, '<', 'd.searchdict' or die;
look $fh, $want, {
  comp => sub { $_[0] <=> $_[1] },
  xfrm => sub { str2time substr shift, 0, 24 },
  };
printf "look took %.3f seconds\n", time - $start;

while( <$fh> ) # now read to end of file
  {
  print;
  }
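For anyone who wants to try the idea without a 209M file or Date::Parse, here is a minimal sketch of the same technique. It assumes a hypothetical log format in which every line starts with a numeric epoch timestamp and the file is sorted by it; the sub name lines_since and the sample data are made up for illustration. Only look() from the core Search::Dict module and tempfile() from File::Temp are used.

```perl
use strict;
use warnings;
use Search::Dict;                 # core module; exports look()
use File::Temp qw( tempfile );

# Return every line whose leading epoch timestamp is >= $cutoff.
# Assumption: lines start with an integer epoch time and are sorted by it.
sub lines_since {
    my ($path, $cutoff) = @_;
    open my $fh, '<', $path or die "$! on $path";

    # look() searches the sorted file and leaves $fh positioned at the
    # first line that compares >= $cutoff.
    look( $fh, $cutoff, {
        comp => sub { $_[0] <=> $_[1] },                       # numeric compare
        xfrm => sub { ( $_[0] // '' ) =~ /^(\d+)/ ? $1 : 0 },  # line -> epoch
    } );

    return <$fh>;                 # list context: all remaining lines
}

# Build a tiny sorted sample log in a temporary file.
my ($out, $path) = tempfile( UNLINK => 1 );
print {$out} map "$_ kernel: log entry\n", 1000, 2000, 3000, 4000;
close $out or die "close failed: $!";

print lines_since( $path, 2500 );
```

With the four sample lines, a cutoff of 2500 should leave the handle at the 3000 entry, so only the 3000 and 4000 lines are printed.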

On the other hand, maybe a real read backwards would work better, so here's a simple package I threw together:

#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=11126426
use warnings;

my $str = join '', map '.' x $_ . "this is line $_\n", 1 .. 60;
print $str;
print "string length = ", length $str, "\n";

my $backwards = Tybalt89BackwardsHeReads->new( \$str ) or die;

while( defined( $_ = $backwards->line ) )
  {
  print;
  }

package Tybalt89BackwardsHeReads; #######################################

sub line
  {
  my ($self) = @_;
  if( @{ $self->{lines} } == 0 and $self->{where} )
    {
    my $window = 1024; # window size, adjust to suit
    my $pos = $self->{where} - $window;
    $pos < 0 and $pos = 0;
    seek $self->{fh}, $pos, 0;
    read $self->{fh}, my $data, $self->{where} - $pos;
    $pos and $data =~ s/^.*\n// ? ($pos += $+[0]) : die "increase window size";
    $self->{lines} = [ split /^/, $data ];
    $self->{where} = $pos;
    }
  return pop @{ $self->{lines} };
  }

sub new
  {
  my ($self, $filename) = @_;
  open my $fh, '<', $filename or die "$! on $filename";
  bless { fh => $fh, where => ref $filename ? length $$filename : -s $filename,
    lines => [] }, ref $self || $self;
  }

1; # so if split off, package ends with true

Use your real filename in the ->new() call instead of the string reference I was using for testing. You can also remove the string generation code.

You may want to change the package name :)

P.S. I enjoyed writing the read backwards code, thanks for the inspiration!

Re^2: efficient way to read a file in reverse
by cmcl (Novice) on Jan 07, 2021 at 17:25 UTC
    Hi Tybalt89,

    Glad you enjoyed it! I omitted the first bit of the code where I do the timestamps for brevity's sake, but yes, the purpose is to look through the last 24 hours of logs for any thermal throttling messages. Thanks for the code, I think that will be quicker for me to adapt than the original File::ReadBackwards (I'll dig into 'tie filehandle' when I get a chance sometime), and there are some things new to me in your example as well which look interesting.

    Cheers,
    Cam

      Since I haven't played with tied file handles before, here's the previous package with the ability to use it either as an object, or as a tied file handle.

      #!/usr/bin/perl

      use strict; # https://perlmonks.org/?node_id=11126426
      use warnings;

      tie *BACKWARDS, 'Tybalt89BackwardsHeReads', $0 or die 'tie failed';

      while( <BACKWARDS> )
        {
        print;
        }

      package Tybalt89BackwardsHeReads; #######################################

      BEGIN { *TIEHANDLE = \&new; *READLINE = \&line; }

      sub line
        {
        my ($self) = @_;
        while( @{ $self->{lines} } == 0 and $self->{where} )
          {
          my $window = 1024; # window size, adjust to suit
          my $pos = $self->{where} - $window;
          $pos < 0 and $pos = 0;
          seek $self->{fh}, $pos, 0;
          read $self->{fh}, my $data, $self->{where} - $pos;
          $pos and $data =~ s/^\N*\n(?=.)//s ? ($pos += $+[0]) : die "increase window size";
          $self->{lines} = [ split /^/, $data ];
          $self->{where} = $pos;
          }
        return pop @{ $self->{lines} };
        }

      sub new
        {
        my ($self, $filename) = @_;
        open my $fh, '<', $filename or die "$! on $filename";
        bless { where => ref $filename ? length $$filename : -s $filename,
          fh => $fh, lines => [] }, ref $self || $self;
        }

      1; # so if split off, package ends with true

      Replace the $0 with the name of your file. I was just using it for debugging.
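For readers who are also new to tied handles, the two aliases in the BEGIN block are all the wiring tie needs for reading: tie() calls TIEHANDLE with its extra arguments, and every <HANDLE> calls READLINE. A stripped-down sketch of just that mechanism, with no file involved, might look like this. The class ReverseListHandle and the sub drain_reversed are hypothetical names used only for illustration, and READLINE here handles scalar context only.

```perl
use strict;
use warnings;

# Hypothetical class for illustration: a read-only handle that yields the
# items given to tie() from last to first.
package ReverseListHandle {
    sub TIEHANDLE {                 # called by tie(); extra arguments arrive here
        my ($class, @items) = @_;
        return bless { items => [@items] }, $class;
    }

    sub READLINE {                  # called for each <HANDLE> (scalar context only)
        my ($self) = @_;
        return pop @{ $self->{items} };   # undef once empty, which ends the loop
    }
}

# Tie a handle to the given items and collect what it returns.
sub drain_reversed {
    my @items = @_;
    tie *REVERSED, 'ReverseListHandle', @items or die 'tie failed';
    my @out;
    while ( defined( my $item = <REVERSED> ) ) {
        push @out, $item;
    }
    untie *REVERSED;
    return @out;
}

print drain_reversed( "first\n", "second\n", "third\n" );
```

The backwards reader above does the same thing, except that its READLINE refills the list from the file one window at a time.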

      Please use this latest version of "sub line"; it solves a problem with fetching the line at the beginning of the file when the window is too small.
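Since the window size is marked "adjust to suit", one possible adaptation is to pass it to the constructor so it can be changed per file. The sketch below keeps the logic of the latest "sub line" but is not the package as posted: the class name BackwardsSketch, the window option, and the helper reversed_lines are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical variant of the reader above: same algorithm, but the window
# size is an option to new() instead of being fixed at 1024 bytes.
package BackwardsSketch {
    sub new {
        my ($class, $source, %opt) = @_;   # $source: filename or string ref
        open my $fh, '<', $source or die "$! on $source";
        my $size = ref $source ? length $$source : -s $source;
        return bless {
            fh     => $fh,
            where  => $size,                  # bytes not yet handed out
            window => $opt{window} || 1024,   # bytes read per chunk
            lines  => [],
        }, $class;
    }

    sub line {
        my ($self) = @_;
        while ( !@{ $self->{lines} } and $self->{where} ) {
            my $pos = $self->{where} - $self->{window};
            $pos = 0 if $pos < 0;
            seek $self->{fh}, $pos, 0;
            read $self->{fh}, my $data, $self->{where} - $pos;
            if ($pos) {
                # Drop the leading (possibly partial) line; the next chunk
                # picks it up again. Fails if no complete line follows it.
                $data =~ s/^\N*\n(?=.)//s or die "increase window size";
                $pos += $+[0];
            }
            $self->{lines} = [ split /^/, $data ];
            $self->{where} = $pos;
        }
        return pop @{ $self->{lines} };
    }
}

# Collect all lines of $text in reverse order using the given window size.
sub reversed_lines {
    my ($text, $window) = @_;
    my $reader = BackwardsSketch->new( \$text, window => $window );
    my @out;
    while ( defined( my $line = $reader->line ) ) {
        push @out, $line;
    }
    return @out;
}

my $text = join '', map "line $_\n", 1 .. 5;
print reversed_lines( $text, 16 );
```

The window still has to be larger than the longest line in the file; with a window that is too small, the reader dies with "increase window size" rather than silently skipping data.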
