Compare two file text input, compare it, replace and write new file

wa2nlinux has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, I have a problem in Perl. I have two text files: 1) data and 2) source.txt. The data file contains text like this:

\tl:&#xa12
\tr:&#xa13
\ca:&#xa15
\ra:&#xa16
\ka:&#xa17
...

The data file contains two columns, "key" and "value", separated by ":". source.txt contains:

\tl\ca\ra\tr\ka ...

source.txt contains keys like those in the data file. I need a new file, target.txt, that contains the values from data corresponding to the keys in source.txt:

&#xa12&#xa15&#xa16&#xa13&#xa17

#!/usr/bin/perl

use strict;
use warnings;

my $dfile = 'data';
my $sfile = 'source.txt';
my $tfile = 'target.txt';
open (DFILE,$dfile) or die "can not open";
open (SFILE,$sfile) or die "can not open";
open (TFILE,"> $tfile") or die "can not open";

my %words;
while (<DFILE>) {
        chomp;
        my ($key, $val) = split /:/;
        $words{$key} .= exists $words{$key} ? "$val" : $val;
};
while (my $s = <SFILE>) {
chomp($s);
my @words = split / /, $s;
foreach my $val (@words) {
  }
   for my $i (0 .. $#words) {
     $words[$i] = $words{$words[$i]} if (exists($words{$words[$i]}))
        }
   print TFILE join(' ', @words),$/;
   print TFILE "<br>";
}


close(SFILE);
close(DFILE);
close(TFILE);

The code above works only if source.txt contains keys separated by spaces:

\ca \ra \ka \tl

but fails if source.txt is:

\ca\ra\ka\tl

I tried building a hash from data, but I am confused about how to look up and replace the keys when source.txt contains them without spaces.


Replies are listed 'Best First'.
Re: Compare two file text input, compare it, replace and write new file by kcott (Archbishop) on Feb 14, 2012 at 05:23 UTC
To populate your hash, change `$words{$key} .= exists $words{$key} ? "$val" : $val;` to `$words{$key} = $val;`

-- Ken
Re: Compare two file text input, compare it, replace and write new file by repellent (Priest) on Feb 14, 2012 at 06:29 UTC
You are splitting your `source.txt` lines on a single space:

    my @words = split / /, $s;

For your code to work, source lines have to look like keys separated by single spaces. Hmm.. it seems like split may not be the right tool, since it's hard to specify what to really split on. Try regex search and replace!

    # match a backslash followed by two characters,
    # then compute the replacement by looking into %words
    $s =~ s{(\\..)}{ $words{$1} || $1 }ge;

Have a look at perlrequick and perlretut. (And follow kcott's advice.)
Re^2: Compare two file text input, compare it, replace and write new file by wa2nlinux (Novice) on Feb 14, 2012 at 23:30 UTC
The text is already formatted, for example `\ha\hang\hu\hung`. Can I use search and replace when the format looks like that? Some codes share the same first or second characters, such as `\ha` and `\hang`. My code already works, but only if I manually add a space after each code, changing `\ha\ca\ra\ka` to `\ha \ca \ra \ka`.
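For keys of different lengths that share a prefix, such as `\ha` and `\hang`, one option is to build the regex from the hash keys themselves and try the longest keys first. The following is only a minimal sketch under assumptions: the four values in `%words` are made-up placeholders (the thread never gives values for these keys), and `$key_re` is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder table: these values are invented for illustration only.
my %words = (
    '\ha'   => '&#xa01',
    '\hang' => '&#xa02',
    '\hu'   => '&#xa03',
    '\hung' => '&#xa04',
);

# Sort the keys longest first so that \hang is tried before \ha,
# and escape each key so the backslash is matched literally.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %words;
my $key_re = qr/($alternation)/;

my $s = '\ha\hang\hu\hung';
$s =~ s/$key_re/$words{$1}/g;

print "$s\n";
```

Because the alternation contains only known keys, text that is not a key passes through unchanged, and no spaces are needed between the codes.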
Re: Compare two file text input, compare it, replace and write new file by jwkrahn (Abbot) on Feb 14, 2012 at 07:28 UTC
    open (DFILE,$dfile) or die "can not open";
    open (SFILE,$sfile) or die "can not open";
    open (TFILE,"> $tfile") or die "can not open";

That would be better as:

    open DFILE, '<', $dfile or die "can not open '$dfile' because: $!";
    open SFILE, '<', $sfile or die "can not open '$sfile' because: $!";
    open TFILE, '>', $tfile or die "can not open '$tfile' because: $!";

And this:

    $words{$key} .= exists $words{$key} ? "$val" : $val;

That would be better as:

    $words{$key} .= $val;

And this:

    while (my $s = <SFILE>) {
    chomp($s);
    my @words = split / /, $s;

That would be better as:

    while ( <SFILE> ) {
        my @words = split;

And this:

    for my $i (0 .. $#words) {
      $words[$i] = $words{$words[$i]} if (exists($words{$words[$i]}))
    }

That would be better as:

    for my $word ( @words ) {
        $word = $words{ $word } if exists $words{ $word };
    }

And this:

    print TFILE join(' ', @words),$/;
    print TFILE "<br>";

That would be better as:

    print TFILE "@words\n<br>";

You wrote that the code works only if source.txt contains keys separated by spaces:

    \ca \ra \ka \tl

but fails if source.txt is:

    \ca\ra\ka\tl

You can split in front of each backslash instead:

    $ perl -le'$_ = q/\ca\ra\ka\tl/; print; print for split /(?=\\)/'
    \ca\ra\ka\tl
    \ca
    \ra
    \ka
    \tl
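The lookahead split can be combined with the hash lookup to produce the unspaced output asked for in the question. This is a rough sketch rather than a drop-in replacement: the table is hard-coded from the five sample data lines in the question instead of being read from the data file, and the fields are joined with an empty string, which is an assumption about the desired target.txt format.

```perl
use strict;
use warnings;

# Key/value pairs copied from the sample "data" lines in the question.
my %words = (
    '\tl' => '&#xa12',
    '\tr' => '&#xa13',
    '\ca' => '&#xa15',
    '\ra' => '&#xa16',
    '\ka' => '&#xa17',
);

my $line = '\tl\ca\ra\tr\ka';

# Zero-width lookahead: split in front of every backslash,
# so each backslash stays attached to its key.
my @keys = split /(?=\\)/, $line;

# Replace known keys and leave unknown ones untouched.
my @values = map { exists $words{$_} ? $words{$_} : $_ } @keys;

# Join without a separator to get one unbroken run of values.
my $result = join '', @values;
print "$result\n";
```

With the sample line from the question this should give `&#xa12&#xa15&#xa16&#xa13&#xa17`, the target shown there.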
Re^2: Compare two file text input, compare it, replace and write new file by wa2nlinux (Novice) on Feb 15, 2012 at 01:55 UTC
OK, the newest code:

    #!/usr/bin/perl

    use strict;
    use warnings;

    my $dfile = 'data';
    my $sfile = 'source.txt';
    my $tfile = 'target.html';
    open DFILE, '<', $dfile or die "can not open '$dfile' because: $!";
    open SFILE, '<', $sfile or die "can not open '$sfile' because: $!";
    open TFILE, '>', $tfile or die "can not open '$tfile' because: $!";

    my %words;
    while (<DFILE>) {
        chomp;
        my ($key, $val) = split /:/;
        $words{$key} = $val;
    };
    while ( <SFILE> ) {
        my @words = split /(?=\\)/;
        for my $word ( @words ) {
            $word = $words{ $word } if exists $words{ $word };
        }
        print TFILE "@words\n<br>";
    }

    close(SFILE);
    close(DFILE);
    close(TFILE);

With this source.txt:

    \ca\ra\ka\tl
    \ca\ra\ka\tl

the last key on each line (\tl) is not replaced by its value from data. The result in target.html is:

    &#xa15 &#xa16 &#xa17 \tl
    <br>&#xa15 &#xa916 &#xa17 \tl
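That symptom is consistent with the newline at the end of each input line staying attached to the last field, so the lookup key becomes `\tl` plus a newline rather than `\tl`. A small sketch of the effect, using an in-memory string in place of a line read from source.txt (the variable names are illustrative only):

```perl
use strict;
use warnings;

my %words = ( '\ca' => '&#xa15', '\tl' => '&#xa12' );

# A line as it would arrive from a file read, newline included.
my $line = "\\ca\\tl\n";

# Without chomp the last field is "\tl\n", which is not a hash key.
my @raw     = split /(?=\\)/, $line;
my $raw_hit = exists $words{ $raw[-1] } ? 1 : 0;

# Strip the newline first, then split: the last field is "\tl".
chomp( my $clean = $line );
my @fields    = split /(?=\\)/, $clean;
my $clean_hit = exists $words{ $fields[-1] } ? 1 : 0;

print "without chomp: $raw_hit, with chomp: $clean_hit\n";
```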
Re^3: Compare two file text input, compare it, replace and write new file by wa2nlinux (Novice) on Feb 15, 2012 at 13:12 UTC
Solved by adding chomp:

    while ( <SFILE> ) {
        chomp;
        my @words = split /(?=\\)/;
        for my $word ( @words ) {
            $word = $words{ $word } if exists $words{ $word };
        }
        print TFILE "@words\n<br>";
    }

Another question: is it OK to use a hash if my "data" file (DFILE) in that program contains about 5000 lines?

    my %words;
    while (<DFILE>) {
        chomp;
        my ($key, $val) = split /:/;
        $words{$key} = $val;
    };
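A hash of a few thousand entries is small by Perl standards, so 5000 lines should not be a concern. One way to check on your own machine is to build a table of that size from synthetic records. The sketch below assumes made-up keys `\k0` through `\k4999` and made-up values, standing in for the real data file:

```perl
use strict;
use warnings;

my %words;

# Synthetic stand-in for a 5000-line data file, one "key:value" record each.
for my $i ( 0 .. 4999 ) {
    my $record = sprintf '\k%d:&#x%04x', $i, $i;

    # Limit of 2 keeps any later ':' inside the value.
    my ( $key, $val ) = split /:/, $record, 2;
    $words{$key} = $val;
}

my $count = keys %words;      # number of entries in scalar context
my $last  = $words{'\k4999'}; # a single lookup by key

print "$count entries, last value $last\n";
```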

