PerlMonks  

file comparison

by sqspat (Acolyte)
on Aug 25, 2009 at 09:09 UTC ( [id://791020]=perlquestion )

sqspat has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have a question about file comparison and the best way to approach it. I have a script which logs into a box, runs some commands and puts the results of those commands into an array. Now I would like to compare those results to the contents of another file, and I am not sure how to do this or, more to the point, what the best way of doing it is. Any suggestions?

Replies are listed 'Best First'.
Re: file comparison
by jethro (Monsignor) on Aug 25, 2009 at 09:29 UTC
    If you just want to know whether the two versions are exactly the same, put them both in strings and use a string compare:

        my $answer = log_in_box_and_do_something();
        open(my $fh, '<', $filename) or die "Could not open $filename: $!\n";
        my $compare = do { local $/; <$fh> };   # slurp the whole file, not just its first line
        close($fh);
        if ($answer eq $compare) {
            # both are equal
        }

    This is a very basic method; the slightest difference will throw it off, even if it is only a single space character. If you want more, you could use a command line utility like diff (on unix/linux), or extract the important elements with some regexes and compare those to the file.
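To make the plain string compare less brittle, one option is to reduce both sides to a canonical form first. This is only a sketch: normalize and same_output are hypothetical names, and the rules chosen here (line endings, trailing whitespace, runs of blanks, trailing blank lines) are assumptions about which differences do not matter for your tests.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce text to a canonical form before comparing.
sub normalize {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;     # unify line endings
    $text =~ s/[ \t]+$//mg;    # drop trailing spaces and tabs on every line
    $text =~ s/[ \t]+/ /g;     # collapse runs of spaces and tabs
    $text =~ s/\n+\z//;        # ignore trailing blank lines
    return $text;
}

# Hypothetical helper: true if both texts are equal after normalization.
sub same_output {
    my ($got, $expected) = @_;
    return normalize($got) eq normalize($expected);
}
```

With this, same_output($answer, $compare) would take the place of the bare eq test, so a stray trailing space no longer fails the regression run.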

Re: file comparison
by james2vegas (Chaplain) on Aug 25, 2009 at 09:16 UTC
    You could look at Algorithm::Diff or its child Text::Diff for diffing files or sets of outputs. What is the eventual goal of the diffing?
      Thanks for your reply. I want to verify that what is returned from various commands on the box does not change when the code changes, i.e. I want to make sure the same responses to the commands are obtained when I install new code. It is part of some automated regression tests I am scripting.

        Oh, in that case just compare the outputs with the cmp program if the outputs are written to files (eg. system("cmp", "-s", "--", $filename1, $filename2) returns a true value if the files differ or cannot be compared).
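If you would rather not shell out, the core module File::Compare does a similar byte-for-byte check from inside Perl. Note that this swaps in a different technique from the external cmp program suggested above, and files_identical is a hypothetical wrapper name, not something from the original post.

```perl
use strict;
use warnings;
use File::Compare qw(compare);

# Hypothetical wrapper: true if both files hold exactly the same bytes.
sub files_identical {
    my ($file1, $file2) = @_;
    my $result = compare($file1, $file2);   # 0 = equal, 1 = different, -1 = error
    die "Could not compare $file1 and $file2: $!\n" if $result < 0;
    return $result == 0;
}
```

Unlike the system() call, the sense of the return value is not inverted here: true means the files are the same.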

        If you're sure you want the outputs in arrays, there are a few ways to compare them. Use join("", @output1) eq join("", @output2) if the lines in the arrays still have their trailing newlines. Use pack("(J/a)*", @output1) eq pack("(J/a)*", @output2) if you're not sure what the delimiters are; each element is packed with a length prefix, so element boundaries cannot be confused. Or use @output1 == @output2 and !grep { $output1[$_] ne $output2[$_] } 0 .. @output1 - 1 if you don't want to waste memory on building the joined copies.
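The memory-frugal variant can also be written as a small sub that stops at the first mismatch instead of letting grep scan every index. This is a sketch; arrays_equal is a hypothetical name and it takes array references.

```perl
use strict;
use warnings;

# Hypothetical helper: true if both arrays hold the same strings in the same order.
sub arrays_equal {
    my ($left, $right) = @_;              # array references
    return 0 unless @$left == @$right;    # different lengths can never match
    for my $i (0 .. $#$left) {
        return 0 if $left->[$i] ne $right->[$i];
    }
    return 1;
}
```

It also shows why the delimiters matter for the join version: ("ab", "c") and ("a", "bc") join to the same string, but arrays_equal(["ab", "c"], ["a", "bc"]) is false.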

Re: file comparison
by ambrus (Abbot) on Aug 25, 2009 at 10:22 UTC

    Could you be more precise? Do you want to compare two files by finding lines that are in one of them but not in the other? Or do you want a line-by-line diff where the order of lines matters as well? If the latter, you can try calling the external diff program, e.g. as in Htmlify code with differences.
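For the first case (lines that are in one output but not in the other, order ignored), a hash lookup is enough. A minimal sketch, with only_in_first as a hypothetical helper name and made-up sample lines:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines of @$left that never occur in @$right.
sub only_in_first {
    my ($left, $right) = @_;              # array references
    my %seen = map { $_ => 1 } @$right;   # every line of the second list
    return grep { !$seen{$_} } @$left;
}

# Made-up sample data standing in for the saved and the fresh command output.
my @expected = ("eth0 up\n", "eth1 up\n", "lo up\n");
my @actual   = ("lo up\n", "eth0 up\n", "eth2 up\n");

print "missing:    $_" for only_in_first(\@expected, \@actual);
print "unexpected: $_" for only_in_first(\@actual, \@expected);
```

This ignores both the order of the lines and how often a line is repeated, so it answers a different question than a real diff does.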

Re: file comparison
by bluestar (Novice) on Aug 25, 2009 at 16:39 UTC
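    Comparing digests of the data is one way to do this, e.g.:

        use strict;
        use warnings;
        use Digest::SHA;

        # three-argument open with error checks; ':raw' reads the bytes unchanged
        open my $in_1, '<:raw', '/tmp/f1.txt' or die "Cannot open /tmp/f1.txt: $!\n";
        open my $in_2, '<:raw', '/tmp/f2.txt' or die "Cannot open /tmp/f2.txt: $!\n";

        # slurp each file and hash its contents
        my $hash1 = Digest::SHA::sha1_hex( do { local $/; <$in_1> } );
        my $hash2 = Digest::SHA::sha1_hex( do { local $/; <$in_2> } );

        die "The files are NOT the same\n" unless $hash1 eq $hash2;
        print "The file contents are identical\n";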
    If your files are really large, i.e. on the order of a few gigabytes, you could use md5sum on the command line to compare the files instead:
    md5sum file1 file2 file3 etc
    Identical files will yield identical hashes. Using md5sum is useful here because the script does not have to load the entire file into memory first.
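The same effect is available inside Perl: the object interface of Digest::SHA can read a file in chunks through addfile, so the contents are never held in memory all at once. A sketch, assuming SHA-256 is acceptable in place of the SHA-1 and MD5 used above; file_digest and same_digest are hypothetical helper names.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper: hex SHA-256 digest of a file, read in chunks by addfile.
sub file_digest {
    my ($path) = @_;
    my $sha = Digest::SHA->new(256);
    $sha->addfile($path, "b");    # "b" asks for binary mode
    return $sha->hexdigest;
}

# Hypothetical helper: true if both files have the same digest.
sub same_digest {
    my ($file1, $file2) = @_;
    return file_digest($file1) eq file_digest($file2);
}
```

For a regression suite you could also store file_digest of the known-good output once and compare later runs against that stored string.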
Re: file comparison
by ruzam (Curate) on Aug 25, 2009 at 17:15 UTC
    If the files are small (which is relative to your system resources), you could simply slurp the files into variables and do a straight out 'eq' test.
