PerlMonks  

utime vs. open/close

by jjhorner (Hermit)
on Jun 19, 2000 at 19:20 UTC ([id://18804]: perlmeditation)

I was looking at tracking user activity through the last access time of a file: whenever a user was active, my code would update the file's access time.

To try to streamline my code, I wanted to see whether 'utime' was faster than 'open/close', so I wrote the following benchmark:

    #!/usr/bin/perl -w
    use warnings;
    use Benchmark;

    my $current_time = time();
    my $filename = "test.txt";

    sub file_open {
        open (FILE, "$filename");
        close FILE;
    }

    sub stat_change {
        utime($current_time, (stat($filename))[9], $filename);
    }

    timethese(3000000, {'openfile' => file_open(), 'changestat' => stat_change()});

No matter how many iterations I tried, it always returned implausibly low numbers:

    [11:08:40 jhorner@gateway scripts]$ ./20000619-1.pl
    Benchmark: timing 3000000 iterations of changestat, openfile...
    changestat:  0 wallclock secs ( 0.40 usr +  0.00 sys =  0.40 CPU) @ 7500000.00/s (n=3000000)
                (warning: too few iterations for a reliable count)
      openfile:  0 wallclock secs (-0.33 usr +  0.00 sys = -0.33 CPU) @ -9090909.09/s (n=3000000)
                (warning: too few iterations for a reliable count)

So, I decided to add a stumbling block to both subroutines:

    #!/usr/bin/perl -w
    use warnings;
    use Benchmark;

    my $current_time = time();
    my $filename = "test.txt";

    sub file_open {
        open (FILE, "$filename");
        close FILE;
        my $i;
        $i++;
    }

    sub stat_change {
        utime($current_time, (stat($filename))[9], $filename);
        my $i;
        $i++;
    }

    timethese(3000000, {'openfile' => file_open(), 'changestat' => stat_change()});

The results came back looking better, but still not really promising:

    [11:10:32 jhorner@gateway scripts]$ ./20000619-1.pl
    Benchmark: timing 3000000 iterations of changestat, openfile...
    changestat:  1 wallclock secs ( 0.88 usr +  0.00 sys =  0.88 CPU) @ 3409090.91/s (n=3000000)
      openfile:  2 wallclock secs ( 0.91 usr +  0.00 sys =  0.91 CPU) @ 3296703.30/s (n=3000000)

While the benchmark reported only 1 or 2 wallclock seconds, the run actually took 10 or 15 seconds.

So, my meditation is this:

Is there a better method for benchmarking this kind of test, or should I be structuring the benchmark differently?

The test, while far from infallible, should give reasonable
results. It doesn't, however. Is 'utime' better than 'open/close',
or vice versa? Is this all just a matter of the system's I/O performance?

Is this question even worth asking?
jj

Replies are listed 'Best First'.
RE: utime vs. open/close
by plaid (Chaplain) on Jun 20, 2000 at 00:28 UTC
    Several comments about your code.

    First, it would probably be better to move the line my $current_time = time(); inside the stat_change sub: getting the current time is something only that sub needs, so it can't fairly be factored out of the timing.

    Secondly, in the file_open sub, you are opening the file for reading, which won't create it if it doesn't exist, and won't update the modification time if it does.

    Third, I don't really see a reason for the stat call in stat_change: if I remember your program correctly, there's nothing that requires the modification time to stay the same, so you could just set it to $current_time as well.

    Fourth, and this is the major problem with the code, you're not passing the subroutines to timethese correctly. You need to pass them by reference, but what you're doing is actually calling them and passing their return values to timethese.
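    The distinction plaid describes can be sketched in isolation (a minimal standalone example, not code from the thread):

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub greet { return "hi" }

my $ref = \&greet;   # a code reference: greet is NOT executed here
my $val = greet();   # greet runs immediately; $val holds its return value

print ref($ref), "\n";   # "CODE" - $ref really is a reference to the sub
print $ref->(),  "\n";   # invoking the reference runs greet now: "hi"
```

    timethese needs the reference form (\&file_open) so it can invoke the sub itself millions of times; passing file_open() hands it only the sub's one-time return value, leaving almost nothing to benchmark.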

    My code:

        #!/usr/bin/perl -w
        use Benchmark;

        my $filename = "test.txt";

        sub file_open {
            open (FILE, ">$filename");
            close FILE;
        }

        sub utime1 {
            my $current_time = time();
            utime($current_time, (stat($filename))[9], $filename);
        }

        sub utime2 {
            my $current_time = time();
            utime($current_time, $current_time, $filename);
        }

        timethese(500000, {
            'openfile' => \&file_open,
            'utime1'   => \&utime1,
            'utime2'   => \&utime2
        });

    which gives:

        Benchmark: timing 500000 iterations of openfile, utime1, utime2...
          openfile: 32 wallclock secs (19.46 usr + 11.91 sys = 31.37 CPU)
            utime1: 18 wallclock secs (11.93 usr +  5.75 sys = 17.68 CPU)
            utime2:  7 wallclock secs ( 3.92 usr +  3.03 sys =  6.95 CPU)
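    For comparison, the same measurement written with current idioms (strict mode, lexical filehandles, error checking, and Benchmark's cmpthese for a relative table) might look like the sketch below; the iteration count is reduced here, and absolute timings will vary by machine:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $filename = "test.txt";

# Ensure the file exists before any timing starts.
open my $fh, '>', $filename or die "open: $!";
close $fh;

cmpthese(100_000, {
    openfile => sub {
        # open for writing: creates/truncates and updates mtime
        open my $f, '>', $filename or die "open: $!";
        close $f;
    },
    utime_keep_mtime => sub {
        # new atime, mtime preserved (costs an extra stat)
        utime(time(), (stat $filename)[9], $filename)
            or die "utime: $!";
    },
    utime_both => sub {
        # new atime and mtime, no stat needed
        my $now = time();
        utime($now, $now, $filename) or die "utime: $!";
    },
});

unlink $filename or die "unlink: $!";
```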
    Note: Since I'm the one who gave you the open/close idea in the first place, I just thought I'd say that if I had remembered utime when I was making that post, I would definitely have suggested it instead. :)

      Good response, but I have a few counterpoints:

      First point: Didn't think of that.

      Second point: The file is there. I don't want creating the file to be part of the issue.

      Third point: I'm working on a different version for work where I need both file access time and file modification time.

      Fourth point: I didn't really consider that. Oops. I figured there was something wrong somewhere.

      Your utime vs. openfile comparison is probably what I need most. Since I need the best way to update the access time only, I'll open the file read-only and just update its access time.
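      The atime-only update described above can be done in a single utime call; a sketch of the idea, using test.txt as a stand-in for the real per-user session file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $session_file = "test.txt";   # stand-in name for the real per-user file

open my $fh, '>', $session_file or die "open: $!";   # ensure it exists
close $fh;

# Record activity: bump atime to now, keep mtime exactly as it was.
my $mtime = (stat $session_file)[9];
utime(time(), $mtime, $session_file) or die "utime: $!";

# Later, measure idle time from the stored access time.
my $idle = time() - (stat $session_file)[8];
print "idle for $idle seconds\n";

unlink $session_file;
```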

      Thanks for the response.

      J. J. Horner
      Linux, Perl, Apache, Stronghold, Unix
      jhorner@knoxlug.org http://www.knoxlug.org/
      
RE: utime vs. open/close
by Odud (Pilgrim) on Jun 19, 2000 at 22:59 UTC
    Presumably all that is happening is that only the data in the buffer cache is changing, so all you are really seeing is the cost of modifying a chunk of memory. (I'm presuming you're running Unix here.)

    For interest I ran it on NT with the following results
        Benchmark: timing 3000000 iterations of changestat, openfile...
        changestat:  4 wallclock secs ( 3.88 usr + -0.02 sys =  3.86 CPU) @ 778008.30/s (n=3000000)
          openfile:  0 wallclock secs ( 1.01 usr + -0.02 sys =  0.99 CPU) @ 3024193.55/s (n=3000000)
    This was running on an ancient Pentium 100MHz, and I detected very little disk activity, so I guess we are seeing caching again. As far as I can tell, the whole subject of benchmarking is prone to this sort of behaviour; it is very difficult to interpret the results unless you understand what is going on underneath. If the machine were heavily loaded and there was contention for the cache, you might see different results.

    Have you thought about loading the machine? You could have a number of processes constantly reading and writing files big enough to force the cache to be written back to disk, so that your writes would involve real I/O.
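    An alternative to loading the whole machine is to take the buffer cache out of the picture per write, by forcing each write to disk with fsync (IO::Handle's sync method); a sketch, assuming the filesystem honours fsync and using a hypothetical scratch-file name:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;   # provides $fh->sync (fsync)

my $file = "load_check.txt";   # hypothetical scratch file
open my $fh, '>', $file or die "open: $!";
$fh->autoflush(1);             # push Perl's buffer to the kernel on each print

for (1 .. 100) {
    print {$fh} "x" x 1024;
    $fh->sync or die "fsync: $!";   # force kernel buffers out to the disk
}

close $fh;
unlink $file or die "unlink: $!";
```

    With the sync call in the loop, each iteration pays for real I/O rather than a memory update, which should make the open/close and utime numbers far less cache-dependent.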

    Perhaps once Athena is complete the next project could be a set of benchmarks to facilitate machine and Perl version comparisons - or does such a thing exist already?
