PerlMonks  

Determine new line number in revised file given old line number

by ExReg (Priest)
on Mar 29, 2017 at 18:23 UTC ( [id://1186417]=perlquestion )

ExReg has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that lists thousands of lines of interest in a large baseline of code. Each entry lists the module and line number. When I get an updated list of lines at a later date, because of changes to the baseline, some lines will stay the same, some will be new, some will go away, and some will have their line number change.

If a module has a line inserted into it or deleted between the two times, the line number of interest will be different. Without doing a diff on the two files, it is difficult to tell whether that changed line listing is due to a line number change or if there was a deletion and insertion.

Since I have thousands of lines tracked, doing manual diffs is not an option. So I wrote a Perl script to automate the process. I can give it the old module, the updated module, and the line number of interest. It will give me what that line number would be in the new module, or 0 if it is no longer there or changed. That way I can tell if it was a line number change, or an addition plus deletion instead.

It does a diff on the two files, then uses the n1an2 or n1,n2cn3 or whatever lines diff gives to get an array of new to old line numbers. I then have to reverse that array into another to get the old to new line numbers. Only then can I use the line number as an index to that array to get the line number in the updated module.

Has anyone had to do something similar? It seems a little convoluted. Is there an easier way to determine what the new line number would be in a modified file? ( I hope I copied my listing correctly. )

# updated_line_number.pl
# updated_line_number.pl old_file revised_file line_number
#
# compares old_file to revised_file and will determine what a line number in old_file will be for revised_file
#
use strict;
my ( $old_file, $new_file, $line_number ) = @ARGV;
my @new_to_old;
my @old_to_new;
my $lines_in_new = 0;
my $lines_in_old = 0;

initialize_arrays();
determine_differences();
print "$old_to_new[$line_number]\n";

sub initialize_arrays{
    # Initialize arrays. Size is old + new to give shrink and grow room during calculations.
    # Just ignore extra when done.
    open my $new_file_handle, "<", $new_file or die "Unable to open $new_file\n";
    $lines_in_new++ while ( <$new_file_handle> );
    open my $old_file_handle, "<", $old_file or die "Unable to open $old_file\n";
    $lines_in_old++ while ( <$old_file_handle> );
    $old_to_new[$_] = 0 for ( 1 .. $lines_in_new + $lines_in_old );
    $new_to_old[$_] = $_ for ( 1 .. $lines_in_new + $lines_in_old );
}

sub determine_differences{
    for my $capture_line ( `diff $old_file $new_file` ) {
        # Only process lines starting with number. Ignore <, >, or --- lines.
        if ( $capture_line =~ /^(?:(\d+),)?(\d+)([acd])(?:(\d+),)?(\d+)$/ ) {
            $3 ne 'd' and $new_to_old[$_] = 0 for ( ( $4 ? $4 : $5 ) .. $5 );
            $new_to_old[$_] += ( $4 ? $4 : $5 ) - $5 + $2 - ( $1 : $1 ? $2 ) + ( $3 cmp 'c' )
                for ( $5 + 1 .. $lines_in_old + $lines_in_new );
        }
    }
    # Now get inverse to go from old to new.
    # There will be complete arrays going old to new and new to old just to get one answer.
    for my $n ( 0 .. $lines_in_old + $lines_in_new ) {
        if ( $new_to_old[$n] > 0 ) {
            $old_to_new[$new_to_old[$n]] = $n;
        } else {
            $old_to_new[$n] = 0;    # If no match, make line number zero.
        }
    }
}
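The hunk-header idea described in the question can also be done in one pass with a running offset, which avoids building and inverting two arrays. The following is only a minimal sketch, assuming normal-format diff output; `old_to_new_line` is a hypothetical helper name, not part of the posted script, and the two headers are what diff is expected to print for the Version1.txt / Version2.txt sample given further down the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the posted script): map one old line
# number to its new line number, given normal-format diff hunk headers
# such as "2d1", "3a3,4" or "5,6c5,7". Returns 0 if the old line was
# deleted or changed.
sub old_to_new_line {
    my ( $line, @headers ) = @_;
    my $offset = 0;    # lines added minus lines removed before $line

    for my $header (@headers) {
        my ( $o1, $o2, $op, $n1, $n2 ) =
            $header =~ /^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$/
            or next;    # skip "<", ">" and "---" lines
        $o2 = $o1 unless defined $o2;
        $n2 = $n1 unless defined $n2;

        if ( $op eq 'a' ) {
            # New lines $n1..$n2 are inserted after old line $o1.
            last if $line <= $o1;
            $offset += $n2 - $n1 + 1;
        }
        else {
            # Old lines $o1..$o2 are deleted ('d') or replaced ('c').
            last     if $line < $o1;
            return 0 if $line <= $o2;
            $offset -= $o2 - $o1 + 1;
            $offset += $n2 - $n1 + 1 if $op eq 'c';
        }
    }
    return $line + $offset;
}

# Assumed headers for the sample files in this thread:
# Line2 deleted, two lines added after Line3.
my @headers = ( '2d1', '3a3,4' );
print "$_ -> ", old_to_new_line( $_, @headers ), "\n" for 1 .. 4;
```

In a real run the headers would come from the diff output, for example by passing the lines returned by the backtick call straight in, since anything that is not a hunk header is skipped.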

Replies are listed 'Best First'.
Re: Determine new line number in revised file given old line number
by tybalt89 (Monsignor) on Mar 29, 2017 at 19:55 UTC
    #!/usr/bin/perl

    # http://perlmonks.org/?node_id=1186417

    use strict;
    use warnings;
    use Algorithm::Diff qw(traverse_sequences);

    my @one = do { local @ARGV = $ARGV[0]; <> };
    my @two = do { local @ARGV = $ARGV[1]; <> };
    my $wantedoldline = $ARGV[2];
    my $oldline = my $newline = 0;

    traverse_sequences( \@one, \@two,
      {
        MATCH => sub {
          $oldline++;
          $newline++;
          if( $oldline == $wantedoldline ) {
            print "$newline\n";
            exit;
          }
        },
        DISCARD_A => sub { $oldline++ },
        DISCARD_B => sub { $newline++ },
      } );
    print "0\n";
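Since the question mentions thousands of tracked lines, the same traversal could fill in the whole old-to-new map once per file pair instead of exiting at the first hit. This is a sketch under that assumption, not tybalt89's code: `build_line_map` is a hypothetical name, and it relies on Algorithm::Diff passing the 0-based indices of both sequences to the MATCH callback.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(traverse_sequences);

# Hypothetical helper: build the complete old-to-new line map for one pair
# of files in a single traversal. Slot 0 is unused; old lines with no
# match in the new file keep the value 0.
sub build_line_map {
    my ( $old, $new ) = @_;    # array refs holding the lines of each file
    my @map = (0) x ( @$old + 1 );
    traverse_sequences(
        $old, $new,
        {
            # $_[0] and $_[1] are 0-based indices into @$old and @$new.
            MATCH => sub { $map[ $_[0] + 1 ] = $_[1] + 1 },
        }
    );
    return \@map;
}

# The sample files from this thread, held in memory for illustration.
my @one = map { "$_\n" } qw(Line1 Line2 Line3 Line4);
my @two = map { "$_\n" } qw(Line1 Line3 Linenew Linenewagain Line4);
my $map = build_line_map( \@one, \@two );
print "$_ -> $map->[$_]\n" for 1 .. $#$map;
```

Every tracked line number for that module can then be looked up as `$map->[$n]` without running the diff again.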

      For me it is amazing how you always present clear, succinct solutions to given problems without much ado. Without teaching, you teach. Learn from this code, which is pushing the bar in the threads context.

      Please stay a while with us, we could learn so much from you.

      I fancy a node called "Zen, perlgolf and the art of coding"... ah well.

      Big thanks, tybalt89++

      perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: Determine new line number in revised file given old line number
by shmem (Chancellor) on Mar 29, 2017 at 19:31 UTC
    syntax error at 1186417.pl line 33, near "$1 :"

    patch:

    - $new_to_old[$_] += ( $4 ? $4 : $5 ) - $5 + $2 - ( $1 : $1 ? $2 ) + ( $3 cmp 'c' )
    + $new_to_old[$_] += ( $4 ? $4 : $5 ) - $5 + $2 - ( $1 ? $1 : $2 ) + ( $3 cmp 'c' )

    It seems a little convoluted.

    For starters, to remove a bit of the convolutedness, I'd drop those ternary constructs and rewrite that line as

    $new_to_old[$_] += ( $4 || $5 ) - $5 + $2 - ( $1 || $2 ) + ( $3 cmp 'c' )

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
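The `||` rewrite works because the optional capture is undefined when a hunk header has no comma, so `$1 || $2` falls back to the single line number exactly as `$1 ? $1 : $2` does. A small sketch to illustrate; `old_range_start` is a hypothetical name used only for this example, and the headers are made-up samples of normal-format diff output.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first old line number named in a
# normal-format diff hunk header, e.g. 5 for "5,6c5,7" and 3 for "3a3,4".
sub old_range_start {
    my ($header) = @_;
    my ( $first, $last ) = $header =~ /^(?:(\d+),)?(\d+)[acd]/
        or return;             # not a hunk header
    return $first || $last;    # same result as: $first ? $first : $last
}

for my $header ( '2d1', '3a3,4', '5,6c5,7', '0a1,2' ) {
    print "$header: old range starts at ", old_range_start($header), "\n";
}
```

Both spellings treat an undefined capture and the string "0" as false, so they agree on every header, including "0a1,2" where lines are added at the top of the file.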

      Thanks for catching the typo. I love the improvement for the ternaries. I always thought it looked ugly.

Re: Determine new line number in revised file given old line number
by Anonymous Monk on Mar 29, 2017 at 18:37 UTC
    Sample inputs & expected output?

      Version1.txt

      Line1
      Line2
      Line3
      Line4

      Version2.txt

      Line1
      Line3
      Linenew
      Linenewagain
      Line4

      updated_line_number.pl Version1.txt Version2.txt 1
      1
      updated_line_number.pl Version1.txt Version2.txt 2
      0
      updated_line_number.pl Version1.txt Version2.txt 3
      2
      updated_line_number.pl Version1.txt Version2.txt 4
      5
