PerlMonks  

Re^3: File::Temp: 2 interfaces get different results with Digest::MD5 and File::Compare

by Corion (Patriarch)
on Aug 29, 2021 at 17:50 UTC ( [id://11136186]=note )


in reply to Re^2: File::Temp: 2 interfaces get different results with Digest::MD5 and File::Compare
in thread File::Temp: 2 interfaces get different results with Digest::MD5 and File::Compare

You're using say instead of print, so a newline is appended to everything you write, and line endings are certainly involved.

You disabled the layers on reading the data back but did you disable the layers when writing the file? I think you're usually on Windows and there, Perl (and say) will usually output \r\n to files.
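To make the point about write-side layers concrete, here is a minimal sketch, not taken from the original script. The helper name bytes_written_with is hypothetical. It writes the same short line through :raw and through :crlf and then reads the bytes back with '<:raw' so that no translation happens on input.

```perl
#!perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Hypothetical helper (not part of the original post): write one line
# through the given layer and return the raw bytes that land on disk.
sub bytes_written_with {
    my ($layer) = @_;
    my ($fh, $filename) = tempfile( UNLINK => 1 );
    binmode $fh, $layer or die "Unable to set $layer on $filename: $!";
    print {$fh} "abc\n";
    close $fh or die "Unable to close $filename after writing: $!";

    # Read back without any layers so we see exactly what was stored.
    open my $in, '<:raw', $filename
        or die "Unable to open $filename for reading: $!";
    my $bytes = do { local $/; <$in> };
    close $in or die "Unable to close $filename after reading: $!";
    return $bytes;
}

my $raw  = bytes_written_with(':raw');
my $crlf = bytes_written_with(':crlf');

printf "%-5s wrote %d bytes\n", ':raw',  length $raw;
printf "%-5s wrote %d bytes\n", ':crlf', length $crlf;
```

The :crlf handle should store one more byte per line than the :raw handle, because each "\n" is written as "\r\n". Any digest taken over the raw bytes will then differ between the two files, even though the data printed was the same.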

Update: On further inspection, the file sizes of the two files are identical, so there is something else afoot. Sorry for the noise.

I looked at replicating your situation using IO layers, but while I can provoke a difference using the :crlf filehandle, I don't get the digests you posted:

#!perl
use 5.14.0;
use strict;
use warnings;
use Carp;
use Data::Dumper;
use Digest::MD5;
use File::Compare (qw| compare |);
use File::Temp qw( tempfile );
use Test::More tests => 1;

my $basic = 'x' x 10**2;
my @digests;

my ($fh1, $t1) = tempfile();
binmode $fh1, ':raw';
for (1..100) { say $fh1 $basic }
close $fh1 or croak "Unable to close $t1 after writing";
push @digests, hexdigest_one_file($t1);
diag "$t1: $digests[0]";

my $t3 = File::Temp->new( UNLINK => 0);
binmode $t3, ':crlf';
for (1..100) { say $t3 $basic }
close $t3 or croak "Unable to close $t3 after writing";
push @digests, hexdigest_one_file($t3);
diag "$t3: $digests[1]";

is $digests[0], $digests[1];

sub hexdigest_one_file {
    my $filename = shift;
    say "Filename: $filename";
    #open my $FH, '<', $filename or croak "Unable to open $filename for reading";
    #print for <$FH>;
    #close $FH;
    my $state = Digest::MD5->new();
    open my $FH, '<:raw', $filename or croak "Unable to open $filename for reading";
    $state->addfile($FH);
    close $FH or croak "Unable to close $filename after reading";
    return $state->hexdigest;
}

1..1
Filename: /tmp/MdfRQx3DVl
# /tmp/MdfRQx3DVl: e395fd01f84d7d1006a99e2a6b8fb832
Filename: /tmp/x589MI1yYB
# /tmp/x589MI1yYB: 7651c6edc9ebdcfa617bcc99e1c8a6f2
not ok 1
#   Failed test at tmp.pl line 29.
#          got: 'e395fd01f84d7d1006a99e2a6b8fb832'
#     expected: '7651c6edc9ebdcfa617bcc99e1c8a6f2'
# Looks like you failed 1 test of 1.

Update 2: Have you asked md5sum which sum is correct? For my code, md5sum outputs hashes identical to what Perl computes for each file.


Replies are listed 'Best First'.
Re^4: File::Temp: 2 interfaces get different results with Digest::MD5 and File::Compare
by jkeenan1 (Deacon) on Aug 29, 2021 at 21:47 UTC
    Corion,

    Based on your suggestion, I developed the workaround below. The trick seems to have three parts to it:

    1. binmode $FH, ':raw': binmode the tempfile(handle) before writing to it. (I suspect that on Unix, we can get away without the ':raw', but whatever.)

    2. close $FH: close the tempfile(handle) after writing to it and before calling hexdigest on it.

    3. Ignore File::Compare::compare() for now. (I don't need it for my real-world problem, anyway.)

    #!perl
    use 5.14.0;
    use warnings;
    use Carp;
    use Data::Dumper;
    use Digest::MD5;
    use File::Temp qw( tempfile );
    use Test::More;

    sub hexdigest_one_file {
        my $filename = shift;
        say "Filename: $filename";
        my $state = Digest::MD5->new();
        open my $FH, '<', $filename or croak "Unable to open $filename for reading";
        $state->addfile($FH);
        close $FH or croak "Unable to close $filename after reading";
        return $state->hexdigest;
    }

    my $basic = 'x' x 10**2;
    my @digests;

    my ($fh1, $t1) = tempfile();
    binmode $fh1, ':raw';
    for (1..100) { say $fh1 $basic }
    close $fh1 or croak "Unable to close $t1 after writing";
    push @digests, hexdigest_one_file($t1);

    my $t3 = File::Temp->new( UNLINK => 0);
    binmode $t3, ':raw';
    for (1..100) { say $t3 $basic }
    close $t3 or croak "Unable to close $t3 after writing";
    push @digests, hexdigest_one_file($t3);

    say Dumper [ @digests ];

    cmp_ok($digests[0], 'eq', $digests[1],
        "Same md5_hex for $t1 and $t3");

    done_testing();

    Why this works I do not know. I think this is, at the very least, a deficiency in the File::Temp documentation and will file a bug report on it.
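One plausible reading of why part 2 (closing before digesting) matters is ordinary output buffering. This is an assumption, not a confirmed diagnosis of the original problem: data printed to a filehandle may still sit in Perl's buffer until the handle is flushed or closed, so a second handle opened on the same filename can see fewer bytes than were printed. The sketch below illustrates that with a File::Temp object; the helper name hexdigest_of_file and the sample content are hypothetical.

```perl
#!perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );
use File::Temp;

# Hypothetical helper: MD5 of whatever bytes are on disk right now.
sub hexdigest_of_file {
    my ($filename) = @_;
    open my $in, '<:raw', $filename
        or die "Unable to open $filename for reading: $!";
    my $state = Digest::MD5->new;
    $state->addfile($in);
    close $in or die "Unable to close $filename after reading: $!";
    return $state->hexdigest;
}

my $content = "abc\n";

# The object is a filehandle and stringifies to the filename.
my $tmp = File::Temp->new;
binmode $tmp, ':raw';
print {$tmp} $content;

# Digest taken while the handle is still open: the few bytes printed
# above are assumed to be sitting in the output buffer, not on disk.
my $before_close = hexdigest_of_file("$tmp");

close $tmp or die "Unable to close $tmp after writing: $!";

# Digest taken after close: the buffer has been flushed.
my $after_close = hexdigest_of_file("$tmp");

my $expected = md5_hex($content);
print "expected:     $expected\n";
print "before close: $before_close\n";
print "after close:  $after_close\n";
```

If buffering is what is going on, only the digest taken after close matches the digest of the content that was printed. Calling $tmp->flush before digesting would be another way to get the bytes onto disk without closing the handle.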

    Thank you for your assistance.

    Jim Keenan
