Poor man's diff

by grinder (Bishop)
on May 23, 2005 at 14:22 UTC (#459572=snippet)

I have a Windows server that I refuse to work on as much as possible, although fortunately it has a copy of Perl 5.6.1 installed. In the process of replacing a batch process that spits out interface files that are imported on another system, I needed to test whether the files I was producing were the same as the originals.

You cannot, as far as I am aware, tell Text::Diff to ignore trailing blanks as being insignificant. So I whipped up the following poor man's diff to do, well, a poor man's diff (no, I don't have Cygwin installed). It showed me that I had a single field in my new replacement that differs from the original. Tracing things further, I found that the original program had a SQL bug in which table t1 was being joined to table t2 instead of t3. Funnily enough, the comments in the original program said that the select was supposed to be on t3.

Exit one 1314-line C program, to be replaced by a 278-line Perl program (of which 95 lines are an SQL select heredoc).

Note that this diff does not notice, nor does it care, if the files are of different lengths. Additional lines in the longer file will be silently ignored. In certain circumstances, this could be construed as a feature.

#! perl -w

# -- a poor man's diff

use strict;

my $in1 = shift || die "no first file";
my $in2 = shift || die "no second file";

open IN1, $in1 or die "input $in1: $!\n";
open IN2, $in2 or die "input $in2: $!\n";

my $nr = 0;
while( defined( my $r1 = <IN1> ) and defined( my $r2 = <IN2> ) ) {
    ++$nr;              # count lines from 1
    $r1 =~ s/\s+$//;    # drop trailing whitespace, newline included
    $r2 =~ s/\s+$//;
    print "files differ at line $nr\n" if $r1 ne $r2;
}

close IN1;
close IN2;
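
As noted above the script, extra lines in the longer file are silently ignored. Where that is not wanted, a variant along the following lines might do. This is only a sketch, not the original author's code: compare_files is a hypothetical helper name, and it uses lexical filehandles and three-argument open, both of which should be available in Perl 5.6.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compare two files line by line, ignoring trailing
# whitespace, and also report when one file has lines left over.
sub compare_files {
    my ($file1, $file2) = @_;
    open my $fh1, '<', $file1 or die "input $file1: $!\n";
    open my $fh2, '<', $file2 or die "input $file2: $!\n";

    my @report;
    my $nr = 0;
    while (1) {
        my $r1 = <$fh1>;
        my $r2 = <$fh2>;
        last unless defined $r1 or defined $r2;    # both files exhausted
        ++$nr;
        if (!defined $r1 or !defined $r2) {
            # Exactly one file ran out: name the one that still has lines.
            my $longer = defined $r1 ? $file1 : $file2;
            push @report, "$longer has extra lines starting at line $nr";
            last;
        }
        s/\s+$// for $r1, $r2;                     # trailing blanks are insignificant
        push @report, "files differ at line $nr" if $r1 ne $r2;
    }
    close $fh1;
    close $fh2;
    return @report;
}

# Run as a script only when two file names are given.
if (@ARGV == 2) {
    print "$_\n" for compare_files(@ARGV);
}
```

Returning the messages as a list rather than printing them directly keeps the comparison easy to reuse from another script.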
Replies are listed 'Best First'.
Re: Poor man's diff
by kaif (Friar) on Jun 03, 2005 at 20:28 UTC
    Good code.

    However, you said that you knew no way to "tell Text::Diff to ignore trailing blanks as being insignificant." I decided to explore this and found that the following code works:

    #!/usr/bin/env perl
    die "Usage: $0 from-file to-file\n" unless @ARGV == 2;
    use Text::Diff;
    diff shift, shift, {
        OUTPUT => \*STDOUT,
        STYLE  => "OldStyle", # or whatever pleases you
        KEYGEN => sub{ (my $line = shift) =~ s/\s*$//; return $line; },
    };

    In general, to compare something other than the lines themselves, just return that from the KEYGEN argument. For example, inserting sub{return substr shift, 0, 1} compares only the first characters. Unfortunately, the documentation for this is hidden in Algorithm::Diff.
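
The idea of comparing derived keys instead of the raw lines can be tried out without installing anything. The sketch below is not how Text::Diff or Algorithm::Diff work internally: differing_positions is a hypothetical helper that compares two lists position by position, as the original snippet does, after passing each element through a key function of the same shape as the KEYGEN callbacks above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Diff or Algorithm::Diff):
# return the 1-based positions at which two lists differ once every
# element has been mapped through $keygen. Positions beyond the end
# of the shorter list are not examined.
sub differing_positions {
    my ($left, $right, $keygen) = @_;
    my $shorter = @$left < @$right ? @$left : @$right;
    return grep {
        $keygen->($left->[$_ - 1]) ne $keygen->($right->[$_ - 1])
    } 1 .. $shorter;
}

# The two key functions mentioned in this reply.
my $strip_trailing = sub { (my $line = shift) =~ s/\s*$//; return $line };
my $first_char     = sub { return substr shift, 0, 1 };

my @old = ("alpha  \n", "beta\n",  "gamma\n");
my @new = ("alpha\n",   "bravo\n", "delta\n");

my @by_line = differing_positions(\@old, \@new, $strip_trailing);
my @by_char = differing_positions(\@old, \@new, $first_char);

print "ignoring trailing blanks: @by_line\n";   # prints 2 3
print "first characters only:    @by_char\n";   # prints 3
```

With the first key function only the trailing blanks on "alpha" are forgiven. With the second, "beta" and "bravo" also count as equal because they share a first character.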
Re: Poor man's diff
by lupey (Monk) on Jun 09, 2005 at 11:53 UTC
    I like your script for its simplicity if one wants a quick answer to what lines differ.

    Having read that you don't have Cygwin installed, I strongly recommend that you, or anybody else in the same position, install it, or else install the GNU utilities for Win32. Once you find yourself needing to write these types of scripts, installing Cygwin is worth it. It has saved me countless hours of trying to write my own versions of tools that already exist on a *nix system.

    I write this because my hope is that it will help many others too.


    unashamed Cygwin advocate
