PerlMonks  

Find and replace based on unique identifier

by oryan (Initiate)
on Aug 20, 2018 at 17:04 UTC ( [id://1220734]=perlquestion )

oryan has asked for the wisdom of the Perl Monks concerning the following question:

I need to find and replace 2 lines of code related to culverts in a model text file. I have a model with all the lines in the right order with the OLD culvert values, and another text file with the NEW culvert values but in the wrong order. What the script does currently:

1. In the model, finds line beginning with text "Connection Culv". This is the line of text I need to replace.
2. Finds the next line after "Connection Culv" that starts with "Conn Culvert Barrel" - this is the unique identifier for the replacement.
3. Pulls new values of "Connection Culv" from text file and replaces them in model.
4. Repeats for all instances of Connection Culv and then saves new file.

Instead of ONLY replacing the line that begins with "Connection Culv", I need it to replace that line and the line that follows it (111 111, 222 222, etc.), but I can't get it to work.

EXAMPLE:

MODEL IN CORRECT ORDER:
Connection Culv=This is Line1
111 111
Conn Culvert Barrel=Culvert1
*
Connection Culv=This is Line2
222 222
Conn Culvert Barrel=Culvert2
*
Connection Culv=This is Line3
333 333
Conn Culvert Barrel=Culvert3
*

REPLACEMENT TEXT FILE:
Connection Culv=This is Line3 - New text here
333 333 This should be new too
Conn Culvert Barrel=Culvert3
*
Connection Culv=This is Line1 - New text here
111 111 This should be new too
Conn Culvert Barrel=Culvert1
*
Connection Culv=This is Line2 - New text here
222 222 This should be new too
Conn Culvert Barrel=Culvert2
*

CURRENT RESULT:
Connection Culv=This is Line1 - New text here
111 111
Conn Culvert Barrel=Culvert1
*
Connection Culv=This is Line2 - New text here
222 222
Conn Culvert Barrel=Culvert2
*
Connection Culv=This is Line3 - New text here
333 333
Conn Culvert Barrel=Culvert3
*

NEEDED RESULT:
Connection Culv=This is Line1 - New text here
111 111 This should be new too
Conn Culvert Barrel=Culvert1
*
Connection Culv=This is Line2 - New text here
222 222 This should be new too
Conn Culvert Barrel=Culvert2
*
Connection Culv=This is Line3 - New text here
333 333 This should be new too
Conn Culvert Barrel=Culvert3
*

There are hundreds of these replacements that need to be made throughout the model. I feel like this should be simple, but nothing has worked. Here is the code I have currently that works for replacing the single line "Connection Culv" but not the following line. Any help is appreciated. Thanks.

# HEC-RAS Replacement Perl Script
# This will find and replace values in the HEC-RAS geometry file for modified culvert barrels. The process is:
#   1. In existing model file (HECRAS_Ex.txt) find where there is a "Connection Culv" (this is the line that needs to be replaced)
#   2. It then looks down for the next Conn Culvert Barrel line (this is the unique identifier)
#   3. It then takes from the new culvert file (culvNEW.txt) the new "Connection Culv" line and replaces it in the existing HECRAS_Ex.txt file.
#   4. Repeats for all and then saves out Output_HECRAS.txt

# Nomenclature for running Perl Script:
# C:\MyDir> perl PERL_SCRIPT.pl culvNEW.txt HECRAS_Ex.txt OutPut_HECRAS.txt

# Read Existing HEC-RAS Geometry File (HECRAS_Ex) with Old Culvert Connection Attributes
open (TEMPLATE, $ARGV[1]) or die;
@HECRAS_Ex = <TEMPLATE>;
close TEMPLATE;

# Read New Culvert Data File (culvNEW) with new Connection culvert Attributes
open (TEMPLATE, $ARGV[0]) or die;
@culvNEW = <TEMPLATE>;
close TEMPLATE;

for ($i=0; $i<@HECRAS_Ex; $i++) {
    # only check lines starting with "Connection Culv" in the HECRAS_Ex file
    if ($HECRAS_Ex[$i] =~ /^Connection Culv/) {
        #print $HECRAS_Ex[$i];
        # look forward for the next Conn Culvert Barrel line
        $iback=$i-1;
        while ($HECRAS_Ex[$iback] !~ /^Conn Culvert Barrel/) {
            $iback=$iback+1;
        }
        $local0=$HECRAS_Ex[$iback];
        chomp($local0);
        # print $HECRAS_Ex[$iback];
        for ($j=0; $j<@culvNEW; $j++) {
            # for ($j=0; $j<1; $j++) {
            $local = $culvNEW[$j];
            chomp($local);
            # print $local;
            # Remove the trailing new line
            # chomp $local;
            # print ($local eq $HECRAS_Ex[$iback]);
            if ($local =~ /^$local0/) {
                # print "match";
                # walk back in culvNEW to the preceding Connection Culv line
                $jforward=$j-1;
                while ($culvNEW[$jforward] !~ /^Connection Culv/) {
                    $jforward=$jforward-1;
                }
                # print $culvNEW[$jforward];
                # Perform substitutions of LG card
                $HECRAS_Ex[$i]=$culvNEW[$jforward];
                # print $HECRAS_Ex[$i];
            }
        }
    }
}

# write out the Geometry File based on the HECRAS_Ex file structure and the new values in the culvNEW file
open (OUT, ">" . $ARGV[2]) or die;
# Write output
print OUT @HECRAS_Ex;
# Close OUT
close OUT;

Replies are listed 'Best First'.
Re: Find and replace based on unique identifier
by tybalt89 (Monsignor) on Aug 20, 2018 at 21:04 UTC
    #!/usr/bin/perl

    # https://perlmonks.org/?node_id=1220734

    use strict;
    use warnings;

    my $new = <<END;
    Connection Culv=This is Line3 - New text here
    333 333 This should be new too
    Conn Culvert Barrel=Culvert3
    *
    Connection Culv=This is Line1 - New text here
    111 111 This should be new too
    Conn Culvert Barrel=Culvert1
    *
    Connection Culv=This is Line2 - New text here
    222 222 This should be new too
    Conn Culvert Barrel=Culvert2
    *
    END

    my $model = <<END;
    Connection Culv=This is Line1
    111 111
    Conn Culvert Barrel=Culvert1
    *
    Connection Culv=This is Line2
    222 222
    Conn Culvert Barrel=Culvert2
    *
    Connection Culv=This is Line3
    333 333
    Conn Culvert Barrel=Culvert3
    *
    END

    my %replace;
    $replace{$2} = $1 while
      $new =~ /^Connection Culv=(.*\n.*\n)(Conn Culvert Barrel=.*\S)/gm;
    $model =~ s/^Connection Culv=\K.*\n.*\n(?=(Conn Culvert Barrel=.*\S))/$replace{$1}/gm;

    print $model;

    Outputs:

    Connection Culv=This is Line1 - New text here
    111 111 This should be new too
    Conn Culvert Barrel=Culvert1
    *
    Connection Culv=This is Line2 - New text here
    222 222 This should be new too
    Conn Culvert Barrel=Culvert2
    *
    Connection Culv=This is Line3 - New text here
    333 333 This should be new too
    Conn Culvert Barrel=Culvert3
Re: Find and replace based on unique identifier
by Cristoforo (Curate) on Aug 20, 2018 at 19:22 UTC
    Hi oryan,

    Here is a solution that prints out the new file in correct order and doesn't use the existing file at all. It sets the input record separator to $/ = "*\n" and reads in 'chunks' of the file. Also, I found a trailing space at the end of each line which I removed prior to parsing the file. If your data does have these trailing spaces, then they would need to be accounted for. I'm assuming they aren't in the 'new' file, but got there by copy and paste.

    The solution uses a Schwartzian transform to process the blocks.

    #!/usr/bin/perl
    use strict;
    use warnings;

    open my $new, '<', \<<EOF;
    Connection Culv=This is Line3 - New text here
    333 333 This should be new too
    Conn Culvert Barrel=Culvert3
    *
    Connection Culv=This is Line1 - New text here
    111 111 This should be new too
    Conn Culvert Barrel=Culvert1
    *
    Connection Culv=This is Line2 - New text here
    222 222 This should be new too
    Conn Culvert Barrel=Culvert2
    *
    EOF

    $/ = "*\n";

    print map {$_->[0]}
          sort {$a->[1] <=> $b->[1]}
          map {[$_, /^Conn Culvert Barrel=Culvert(\d+)$/m]} <$new>;

    Output was:
    Connection Culv=This is Line1 - New text here
    111 111 This should be new too
    Conn Culvert Barrel=Culvert1
    *
    Connection Culv=This is Line2 - New text here
    222 222 This should be new too
    Conn Culvert Barrel=Culvert2
    *
    Connection Culv=This is Line3 - New text here
    333 333 This should be new too
    Conn Culvert Barrel=Culvert3
    *

      Thank you for the reply. I provided a very simplified, stripped-down version of the model itself. The real model has thousands of lines, and the hundreds of lines I need to replace are scattered throughout it, so unfortunately sorting is not an option. Sorry if I was not clear on that in my initial post. Here is a small example of the actual model itself. There are 3 examples of the culvert line that I would need to replace in this snippet (#1. Conn Culvert Barrel=1,Cul1,0, #2. Conn Culvert Barrel=2,Cul1,0, #3. Conn Culvert Barrel=1,Cul2,2). Thanks again for any help:

      BreakLine Name=BL1
      BreakLine CellSize Min=
      BreakLine CellSize Max=
      BreakLine Near Repeats=1
      BreakLine Polyline= 2
      749229.01307448720208.606197753 749310.99413914719981.581711002
      Connection=Connection1 ,749170.3458749,719393.4466371
      Connection Desc=
      Connection Line=4
      749264.197132839719397.961390208749158.302314896719392.527851194
      749148.302367149719392.560178611 749076.48011895719389.280834652
      Connection Last Edited Time=Aug/17/2018 12:55:44
      Conn Near Repeats=1
      Connection Up Reach=
      Connection Up River=
      Connection Up RS=
      Connection Dn Reach=
      Connection Dn River=
      Connection Dn RS=
      Conn Routing Type= 1
      Conn Use RC Family=False
      Conn OverFlow Method 2D=False
      Conn Weir WD=10
      Conn Weir Coef=3
      Conn Weir Is Ogee= 0
      Conn Simple Spill Pos Coef=0.05
      Conn Simple Spill Neg Coef=0.05
      Conn Weir SE= 18
      01500.45630.77757 1500.7940.786091500.95655.810461501.13160.817041501.216
      65.823611501.27975.836771501.38685.849921501.55495.863081501.659100.86971501.775
      116.03411502.304120.88151502.426135.89721502.754140.90241502.841152.95841503.119
      179.7221503.689185.9493 1503.88187.93121503.923
      Connection Culv=2,3,10,100.69,0.02,0.3,1,8,1,1002,1001, 1 ,Group1B , 0 ,
      100 100
      Conn Culvert Barrel=1,Cul1,0
      Conn Culv Bottom n=0.02
      Connection Culv=1,3,,100.69,0.02,0.3,1,2,3,1000.01,999.01, 1 ,Group1A , 0 ,
      115 115
      Conn Culvert Barrel=2,Cul1,0
      Conn Culv Bottom n=0.02
      Conn Outlet Rating Curve= 0 ,False,,
      Connection=Connection2 ,749150.8027089,720183.502415
      Connection Desc=
      Connection Line=3
      749212.113799506720183.551667985749148.471637979720183.975109164
      749089.500952284720182.466598541
      Connection Last Edited Time=Aug/17/2018 12:56:39
      Conn Near Repeats=1
      Connection Up Reach=
      Connection Up River=
      Connection Up RS=
      Connection Dn Reach=
      Connection Dn River=
      Connection Dn RS=
      Conn Routing Type= 1
      Conn Use RC Family=False
      Conn OverFlow Method 2D=False
      Conn Weir WD=10
      Conn Weir Coef=3
      Conn Weir Is Ogee= 0
      Conn Simple Spill Pos Coef=0.05
      Conn Simple Spill Neg Coef=0.05
      Conn Weir SE= 15
      01505.09818.654211505.91923.654321506.17528.654431506.16133.654541505.944
      58.65511505.82883.661761505.80188.66339 1505.7493.665021505.73598.666661505.694
      103.66831505.722108.66991505.781113.67161505.877118.67321506.027122.6335 1506.19
      Connection Culv=4,3,4,79.64,0.02,0.3,1,29,2,1505,1504.5, 1 ,Group2 , 0 ,
      50 50
      Conn Culvert Barrel=1,Cul2,2
      749148.417384509720223.797157772749148.525891449720144.153060555
      Conn Culv Bottom n=0.02
      Conn Outlet Rating Curve= 0 ,False,,
      LCMann Time=Dec/30/1899 00:00:00
      LCMann Region Time=Dec/30/1899 00:00:00
      LCMann Table=0
      Chan Stop Cuts=-1
      Use User Specified Reach Order=0
      GIS Ratio Cuts To Invert=-1
      GIS Limit At Bridges=0
      Composite Channel Slope=5
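
For a model like the one above, where the culvert records are scattered among other lines, one possible line-based approach is to index the new file by its "Conn Culvert Barrel" line and then walk the model, swapping in both the "Connection Culv" line and the line after it. This is only a sketch, not a tested fix for the real geometry file. It assumes that exactly one line sits between "Connection Culv=" and "Conn Culvert Barrel=" (as in the examples in this thread), that the barrel line is unique per culvert, and that trailing whitespace should be ignored when comparing barrel lines. The helper names index_culverts and replace_culverts are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map each "Conn Culvert Barrel" line to the two lines
# ("Connection Culv=..." and the line after it) that precede it.
sub index_culverts {
    my @lines = @_;
    my %pair_for;
    for my $i (0 .. $#lines - 2) {
        next unless $lines[$i] =~ /^Connection Culv=/
                and $lines[$i + 2] =~ /^Conn Culvert Barrel=/;
        (my $key = $lines[$i + 2]) =~ s/\s+\z//;   # ignore trailing whitespace
        $pair_for{$key} = [ @lines[$i, $i + 1] ];
    }
    return \%pair_for;
}

# Hypothetical helper: return a copy of the model with each matching pair
# of lines replaced. Both arguments are array refs of lines with newlines.
sub replace_culverts {
    my ($model, $new) = @_;
    my $pair_for = index_culverts(@$new);
    my @out = @$model;
    for my $i (0 .. $#out - 2) {
        next unless $out[$i] =~ /^Connection Culv=/
                and $out[$i + 2] =~ /^Conn Culvert Barrel=/;
        (my $key = $out[$i + 2]) =~ s/\s+\z//;
        my $pair = $pair_for->{$key} or next;   # no replacement: leave as is
        @out[$i, $i + 1] = @$pair;
    }
    return \@out;
}

# Same argument order as the original script: new culverts, model, output.
if (@ARGV == 3) {
    my ($new_file, $model_file, $out_file) = @ARGV;
    open my $nfh, '<', $new_file or die "Cannot read $new_file: $!";
    my @new = <$nfh>;
    close $nfh;
    open my $mfh, '<', $model_file or die "Cannot read $model_file: $!";
    my @model = <$mfh>;
    close $mfh;
    open my $ofh, '>', $out_file or die "Cannot write $out_file: $!";
    print {$ofh} @{ replace_culverts(\@model, \@new) };
    close $ofh or die "Cannot close $out_file: $!";
}
```

Because the new file is indexed once into a hash, each model record needs a single lookup instead of a scan of the whole new file, and records with no entry in the new file are left unchanged.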

Node Type: perlquestion [id://1220734]
Approved by Corion
Front-paged by haukex