Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling

Re: Immediately writing the results of search-and-replace

by kcott (Archbishop)
on Aug 06, 2022 at 04:30 UTC ( [id://11145980]=note: print w/replies, xml ) Need Help??

in reply to Immediately writing the results of search-and-replace

This looked like an interesting problem to get the mental juices flowing on a Saturday morning. I see you've already received much help so, rather than trying to modify your code, I decided to write it from scratch. A few notes on some the things I did follow; ask if you want to know about anything else.

  • The autodie pragma handles I/O exceptions so I don't have to worry about doing that. I note that you're not checking any I/O operations.
  • I used constants to specify the indices of the search and replace elements in the CSV file. If the CSV format ever changes, you'll probably only need to change one or two numbers; as opposed to recoding splice() statements in a couple of places.
  • I used IO::Prompter for the choice of replacement text where more than one option existed. That used a lot less code than what's curently in search_and_replace() and has many more features. Options are presented in the form of a menu and you only need to hit one key (i.e. no newline) to make a selection. I've also used colour: if you want to use that you may need to fiddle with it a bit (it looks good on my black background). That uses Term::ANSIColor; I've added extra colour in a few places, such as highlighting the text to be replaced.
  • Text::CSV will use Text::CSV_XS if it's installed; but the code will still work if it's not.
  • The code is modular and pretty much all the work is done in three, short subroutines: _process_csv(), _process_text() and _replace(). Also note that those routines only use variables that are passed to them: there's no action-at-a-distance, or similar problems, from global or file-scoped variables.


#!/usr/bin/env perl use strict; use warnings; use autodie; use constant { SEARCH => 0, REPLACE => 2, }; use File::Copy; use List::Util 'uniq'; use IO::Prompter [ -single, -style => 'bold blue', -echostyle => 'bold magenta', ]; use Term::ANSIColor; use Text::CSV; # For testing: BEGIN { copy('whitelist_sr_original.txt', 'whitelist_sr.txt'); } my $csv_file = 'whitelist_sr.csv'; my $source_file = 'whitelist_sr_src.txt'; my $text_file = 'whitelist_sr.txt'; copy($text_file, $source_file); my ($sr, $re) = _process_csv($csv_file); _process_text($source_file, $text_file, $sr, $re); unlink $source_file; sub _process_text { my ($source_file, $text_file, $sr, $re) = @_; open my $src_fh, '<', $source_file; open my $txt_fh, '>', $text_file; while (my $line = <$src_fh>) { while ($line =~ /$re/g) { my $frag = $1; $line =~ s/$frag/_replace($line, $frag, $sr, $re)/e; } print $txt_fh $line; } return; } sub _replace { my ($line, $frag, $sr, $re) = @_; my $replace; if (@{$sr->{$frag}} == 1) { $replace = $sr->{$frag}[0]; } else { $line =~ /$re/; $line = colored(substr($line, 0, $-[0]), 'dark yellow') . colored($1, 'bold yellow') . colored(substr($line, $+[0]), 'dark yellow'); print colored("Current line:", 'bold white'), ' ', $line; my $index = prompt "Replace '$frag' with:", -number, -menu => { map +($sr->{$frag}[$_] => $_), 0 .. $#{$sr->{$frag} +} }, '> '; $replace = $sr->{$frag}[$index]; } return $replace; } sub _process_csv { my ($csv_file) = @_; my $sr = {}; my $csv = Text::CSV::->new(); open my $fh, '<', $csv_file; while (my $row = $csv->getline($fh)) { push @{$sr->{$row->[SEARCH]}}, $row->[REPLACE]; } $sr->{$_} = [ uniq @{$sr->{$_}} ] for keys %$sr; my $alt = join '|', map "\Q$_", sort { length $b <=> length $a } keys %$sr; my $re = qr{\b($alt)\b}; return ($sr, $re); }

I extended your CSV file to enable more testing. It's in the spoiler:

I also created my own text file for testing. Again, in the spoiler:

Here's a rough representation of part of a run, showing dynamic changes to a line as a number of matches are found and the selected substitutions are made:

Current line: whitelist whitelist whitelist whitelist
Replace 'whitelist' with:
   1. access list
   2. access-list
   3. allow

> 2
Current line: access-list whitelist whitelist whitelist
Replace 'whitelist' with:
   1. access list
   2. access-list
   3. allow

> 3
Current line: access-list allow whitelist whitelist
Replace 'whitelist' with:
   1. access list
   2. access-list
   3. allow

> 1
Current line: access-list allow access list whitelist
Replace 'whitelist' with:
   1. access list
   2. access-list
   3. allow

> 2

Here's the final result after that full run. Spoiler again:

— Ken

Replies are listed 'Best First'.
Re^2: Immediately writing the results of search-and-replace
by elemkeh (Acolyte) on Aug 09, 2022 at 14:37 UTC
    Your code is far more elegant than what I had written originally; I will be reading it carefully to add it to my understanding and I'm sure it will improve my Perl efforts going forward. Thank you.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11145980]
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (5)
As of 2024-04-23 01:20 GMT
Find Nodes?
    Voting Booth?

    No recent polls found