
Re^2: File::Sort issues

by aartist (Pilgrim)
on Jul 11, 2011 at 23:53 UTC

in reply to Re: File::Sort issues
in thread File::Sort issues

How will this handle the case of missing lines in the file?

Re^3: File::Sort issues
by Somni (Friar) on Jul 12, 2011 at 00:22 UTC
    Duplicate lines, missing lines, extra lines, etc. are all displayed by diff, and controlled through its options.

    For example, a unified diff (-u) marks added lines with + and removed lines with -. If large chunks of lines are added, removed, or moved, diff will show several groups of + and - lines; you can use -d (--minimal) to make diff try harder to find a smaller set of changes.

    The point is, once you've normalized the files, you have a wide variety of comparison tools available: plain-text output from diff; side-by-side comparison in an editor with vimdiff; byte-by-byte comparison with cmp; sort and uniq to reduce the data to some subset; and so on.
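
As a rough illustration of the missing-line case, here is a minimal sketch. The file names and sample lines are made up for the example, and comm is a tool not mentioned above that I am adding because it splits two sorted files into "only in first" and "only in second". It assumes sorting is the normalization you want.

```shell
#!/bin/sh
# Use one collation for sort and comm so they agree on ordering.
LC_ALL=C
export LC_ALL

# Hypothetical sample data: new.txt lacks "beta" and has an extra "delta".
tmp=$(mktemp -d)
printf 'gamma\nalpha\nbeta\n' > "$tmp/old.txt"
printf 'alpha\ndelta\ngamma\n' > "$tmp/new.txt"

# Normalize: sort each file so the original line order no longer matters.
sort "$tmp/old.txt" > "$tmp/old.sorted"
sort "$tmp/new.txt" > "$tmp/new.sorted"

# Unified diff: "-" lines are only in old, "+" lines are only in new.
# diff exits with status 1 when the files differ, hence the "|| true".
diff_out=$(diff -u "$tmp/old.sorted" "$tmp/new.sorted" || true)
printf '%s\n' "$diff_out"

# comm needs sorted input: -23 keeps lines unique to the first file,
# -13 keeps lines unique to the second file.
missing=$(comm -23 "$tmp/old.sorted" "$tmp/new.sorted")
extra=$(comm -13 "$tmp/old.sorted" "$tmp/new.sorted")
echo "missing from new: $missing"
echo "extra in new: $extra"

rm -rf "$tmp"
```

The diff output shows the missing line with a leading - and the extra line with a leading +, so nothing special is needed to handle missing lines. The comm step is only a convenience if you want the two sets separately.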