PerlMonks  

Reporting "percent done" while processing a large file

by mbond (Beadle)
on Sep 18, 2001 at 16:40 UTC ( [id://113099]=perlquestion )

mbond has asked for the wisdom of the Perl Monks concerning the following question:

A friend and I were working on a script to weed through an 80MB file and then copy the results to a new location. The topic of printing a percent complete to the screen while the various stages of the task were being completed came up.

Is there a good way to do this?

Mbond

Replies are listed 'Best First'.
Re: status
by MZSanford (Curate) on Sep 18, 2001 at 16:50 UTC
    I am a huge proponent of status messages, but with text files, to show a percentage you would first have to count the lines (which is a waste on large data). I tend to do the following for text data:
    while (my $line = <INFILE>) {
        if ( ($. % 5000) == 0) {
            print "Processed $. records\n";
            # or, just to let them know
            # print ".";
        }
        ## process away
    }
    Now, being Mr. Binary-data-can-be-done-with-perl, I tend to do a lot of binary ops on files, and for that you can do (this is untested, but mostly just to show the idea):
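    my $TOTAL = (-s $FILE);
    my ($BYTES_READ, $next_report) = (0, 5000);
    while (my $n = read(INFILE, $buf, $bufsize)) {
        ## do work
        $BYTES_READ += $n;    # read() returns the bytes actually read
        # report about every 5000 bytes; an exact "% 5000 == 0" test
        # would only fire when the buffer size divides 5000 evenly
        if ($BYTES_READ >= $next_report) {
            printf("%.3f %% (%d of %d)\n", ($BYTES_READ/$TOTAL)*100, $BYTES_READ, $TOTAL);
            $next_report = $BYTES_READ + 5000;
        }
    }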

    my own worst enemy
    -- MZSanford
      Well, with text you can still use the latter solution, which avoids counting the lines beforehand when the file is not a fixed-record file:
      my $total = ( -s $FILE );
      my $cummulative = 0;
      while ( my $line = <FILE> ) {
          $cummulative += length( $line );
          do_processing( $line );
          printf( "%2d%% Done\n", $cummulative/$total*100.0 );
      }

      -----------------------------------------------------
      Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
      It's not what you know, but knowing how to find it if you don't know that's important

      Perhaps it's just me, but I always find the construct:
      if ( ( whatever ) == 0 ) { }
      easier to read, when written like so:
      unless ( whatever ) { }
      So, I would have written the above as:
      unless ($. % 5000) {
          print "Processed $. records\n";
          # or, just to let them know
          # print ".";
      }

      -Blake
      off to try that cool -s trick for status bars....

Re: Reporting "percent done" while processing a large file
by clintp (Curate) on Sep 18, 2001 at 20:06 UTC
    Whether you're processing text or binary, the point is to show some sort of progress towards the end goal. Now for text this kind of assumes that the lines are somewhat balanced (the standard deviation of their lengths isn't outrageous). You don't need to be too exact though, and tell() can be your friend:
    open(F, "/usr/dict/words") || warn;    # File to process
    $s = -s F;
    $| = 1;
    while (<F>) {    # Binary read() or readline()...whatever.
        # Do something with the data I presume...
        printf "\rComplete %.2f", 100*tell(F)/$s;
    }
    For large files this slows things down considerably. You may want to try:
    printf("\rComplete: %.2f", 100*tell(F)/$s) unless $i++ % 10;
    Or something like it so there's not quite so much output.
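One possible variation on the tell() idea above, offered only as a sketch: instead of printing every Nth line, print only when the whole-number percentage changes, which caps the output at about 101 updates regardless of file size. The names `percent_done` and `process_with_progress` are hypothetical helpers made up for this illustration, and the file is assumed to be opened without any decoding layer, so that tell() and -s both count bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: whole-number percentage of $size bytes already
# consumed on filehandle $fh, according to tell().
sub percent_done {
    my ($fh, $size) = @_;
    return 100 unless $size;              # empty file: nothing left to do
    return int(100 * tell($fh) / $size);
}

# Hypothetical driver: calls $work on every line and prints progress
# only when the whole-number percentage changes.
# Returns the last percentage printed (-1 if the file had no lines).
sub process_with_progress {
    my ($path, $work) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $size = -s $fh;
    my $last = -1;
    local $| = 1;                         # unbuffered, so "\r" updates show up
    while (my $line = <$fh>) {
        $work->($line);
        my $pct = percent_done($fh, $size);
        if ($pct != $last) {
            printf "\rComplete: %3d%%", $pct;
            $last = $pct;
        }
    }
    print "\n";
    close($fh);
    return $last;
}
```

A call would look something like `process_with_progress("/usr/dict/words", sub { my ($line) = @_; ... })`, with the callback standing in for whatever the real processing is.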
Re: Reporting "percent done" while processing a large file
by archen (Pilgrim) on Sep 19, 2001 at 00:00 UTC
    Well, my Perl skills are pretty weak, but here's what I usually do: I make a cheesy text status bar. More or less, I print a line that marks out where the progress will go, then print an '*' for something like every 5% done. Usually I just use the stat function to find the file size and divide by the number of asterisks I want in the bar, etc. (this is similar to a computer science problem I once had in college). There is some computational overhead from comparing against an accumulator, but as far as I can tell, not a whole lot. (I'd post some code, but truthfully my code looks pretty sad.)
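Since no code was posted for the status bar, here is a minimal sketch of how the description above might be turned into Perl. It is a guess at the approach, not archen's actual code: `stars_due` and `process_with_bar` are hypothetical names, the default of 20 asterisks (one per 5%) is an assumption, and the byte count relies on the file being read without a decoding or CRLF-translating layer, so that length() matches the size reported by stat.

```perl
use strict;
use warnings;

# Hypothetical helper: how many asterisks of a $width-wide bar should be
# showing once $done of $total bytes have been processed.
sub stars_due {
    my ($done, $total, $width) = @_;
    return $width unless $total;          # empty file counts as finished
    my $n = int($width * $done / $total);
    return $n > $width ? $width : $n;
}

# Hypothetical driver: prints a ruler line, then fills in asterisks
# underneath it as the file is consumed. Returns the number printed.
sub process_with_bar {
    my ($path, $work, $width) = @_;
    $width ||= 20;                        # assumed default: one '*' per 5%
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $total = (stat($fh))[7];           # element 7 of stat() is the size in bytes
    my ($done, $shown) = (0, 0);
    local $| = 1;                         # show each asterisk immediately
    print '|', '-' x ($width - 2), "|\n"; # the line marking the bar's extent
    while (my $line = <$fh>) {
        $work->($line);
        $done += length $line;            # the accumulator
        my $due = stars_due($done, $total, $width);
        if ($due > $shown) {
            print '*' x ($due - $shown);
            $shown = $due;
        }
    }
    print "\n";
    close($fh);
    return $shown;
}
```

The only per-line overhead is one addition, one multiply and divide, and one comparison, which matches the observation that the accumulator approach costs very little.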
