http://qs321.pair.com?node_id=604565


in reply to Re^2: Search and delete lines based on string matching
in thread Search and delete lines based on string matching

Here's an approach that initializes a hash of output tokens, prepopulated with marker text for new additions. Entries that already exist in the input file override the marker text, and the sorted list is appended to the specified output file (open it with '>' instead of '>>' if you would rather overwrite the file than append to it). Note that it isn't fully compatible with the delete code I posted earlier, since that code didn't take comments into account.
#!/usr/local/bin/perl
use strict;
use warnings;

if (@ARGV != 3) {
    print "Usage: $0 <pattern file> <input file> <output file>\n";
    exit;
}

my ($pattern_filename, $source_filename, $dest_filename) = @ARGV;

# Prepopulate the hash with every pattern line, marked as a new addition.
open my $pattern_fh, '<', $pattern_filename or die "Failed to open $pattern_filename: $!";
my %output_tokens = ();
while (my $line = <$pattern_fh>) {
    chomp $line;
    $output_tokens{$line} = "$line # Added by script";
}
close($pattern_fh);
print "Expected tokens: ", join(', ', keys %output_tokens), "\n";

open my $infile, "<", $source_filename or die "Failed to open $source_filename: $!";
open my $outfile, ">>", $dest_filename or die "Failed to open $dest_filename: $!";

# Lines already present in the input file replace the marker text.
while (my $line = <$infile>) {
    chomp $line;
    $output_tokens{$line} = $line;
}

# Append the sorted result to the output file.
for my $token (sort keys %output_tokens) {
    print $outfile $output_tokens{$token}, "\n";
}

close($infile);
close($outfile);

Re^4: Search and delete lines based on string matching
by brut (Initiate) on Mar 13, 2007 at 15:26 UTC
    Hey, the very first code you provided works fine for normal strings without any non-alphabetic characters, but it does not work for strings like A[0] or B[1]. I tried the fix you gave, but it is not working either, so the code needs to be modified to handle these strings as well. Everything else is fine. Thanks a lot for all the innovative solutions you have provided. You really are excellent.
      It seems to work for me, unless I misunderstood your specs. Here are the files I am using:
      patterns.txt
      A[0]
      C
      D
      infile.txt
      A[0]
      B
      C
      D[0]
      D1
      DA
      brut.pl
      #!/usr/local/bin/perl
      use strict;
      use warnings;

      if (@ARGV != 3) {
          print "Usage: $0 <pattern file> <input file> <output file>\n";
          exit;
      }

      my ($pattern_filename, $source_filename, $dest_filename) = @ARGV;

      open my $pattern_fh, '<', $pattern_filename or die "Failed to open $pattern_filename: $!";
      my @tokens = ();
      while (my $line = <$pattern_fh>) {
          chomp $line;
          push @tokens, quotemeta($line);
      }
      my $pattern = '^(?:' . join('|', @tokens) . ')[^a-zA-Z]*$';
      print "Search pattern: $pattern\n";

      open my $infile, "<", $source_filename or die "Failed to open $source_filename: $!";
      open my $outfile, ">>", $dest_filename or die "Failed to open $dest_filename: $!";

      while (my $line = <$infile>) {
          print "input : $line";
          if ($line =~ /$pattern/) {
              next;
          }
          print "output: $line";
          print $outfile $line;
      }

      close($infile);
      close($outfile);
      perl brut.pl patterns.txt infile.txt outfile.txt
      Search pattern: ^(?:A\[0\]|C|D)[^a-zA-Z]*$
      input : A[0]
      input : B
      output: B
      input : C
      input : D[0]
      input : D1
      input : DA
      output: DA
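
The same behaviour can be reproduced without any files, which may make it easier to see which lines the pattern removes. This is only a minimal sketch: the helper name build_pattern and the hard-coded lists are stand-ins for patterns.txt and infile.txt, not part of brut.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the same pattern string as brut.pl
# from a list of tokens instead of a pattern file.
sub build_pattern {
    my @tokens = map { quotemeta($_) } @_;
    return '^(?:' . join('|', @tokens) . ')[^a-zA-Z]*$';
}

my $pattern = build_pattern('A[0]', 'C', 'D');

# Keep every line that the pattern does not match.
my @kept = grep { !/$pattern/ } ('A[0]', 'B', 'C', 'D[0]', 'D1', 'DA');
print "kept: @kept\n";
```

This prints "kept: B DA". The trailing [^a-zA-Z]* lets any run of non-letters follow a token, so the token D also removes D[0] and D1, while DA survives because A is a letter.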
        Yes, it's fine, except that the output should also contain D[0], D1 and DA, because I need exact matching: a plain D should not delete D1 and DA from the output. Also, in the addition code, I don't want the new outfile to be sorted. It should simply append the strings, whatever they are, without any string matching of file B against file A. So for the addition requirement, if A contains B C F[0] and B contains D[0] E, the output should be D[0] E B C F[0].
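
One way both requests could be met is sketched below. It assumes that "exact matching" means a line is deleted only when it equals a pattern line character for character, and that the addition step is plain concatenation with file B's lines first, as in the example in the post. The sub names filter_exact and append_unsorted are made up for illustration, and the in-memory lists stand in for the files (lines read from real files would need chomp first).

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines that do not exactly equal
# any pattern line. A hash lookup sidesteps regex metacharacters in
# strings such as A[0].
sub filter_exact {
    my ($patterns, $lines) = @_;
    my %delete = map { $_ => 1 } @$patterns;
    return grep { !exists $delete{$_} } @$lines;
}

# Hypothetical helper: B's lines followed by A's lines, in their
# original order, with no sorting and no matching between the two.
sub append_unsorted {
    my ($file_a, $file_b) = @_;
    return (@$file_b, @$file_a);
}

my @kept = filter_exact(['A[0]', 'C', 'D'],
                        ['A[0]', 'B', 'C', 'D[0]', 'D1', 'DA']);
print "kept: @kept\n";

my @combined = append_unsorted(['B', 'C', 'F[0]'], ['D[0]', 'E']);
print "combined: @combined\n";
```

With the sample data from this thread it prints "kept: B D[0] D1 DA" and "combined: D[0] E B C F[0]". Using a hash instead of a regex is a design choice here, not the only option; anchoring the earlier pattern as ^(?:...)$ without the trailing character class should give the same exact-match behaviour.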