Re: Newbie: uses/limits of perl in editing files
by tirwhan (Abbot) on Nov 23, 2007 at 14:07 UTC
Welcome to the monastery. From the task description I'd say Perl is very well suited to this. Here's an example of a program that roughly does what you describe:
#!/usr/bin/perl
use warnings;
use strict;
my $filename = "whateveryourfileiscalled.txt";
my $newfile = "whateveryouwantthechangedfiletobecalled.txt";
open (my $rfh,"<",$filename) or die "Can't open file $filename : $!";
open (my $wfh,">",$newfile) or die "Can't open file $newfile : $!";
while (my $line = <$rfh>) {
    if ($line =~ m/^HEADER/) {
        chomp $line;
        my $number = 42; # change to whatever number you want to use
        $line .= $number."\n";
    }
    if ($line =~ m/^REMARK/) {
        print {$wfh} "Extra line\n"; # Change to whatever extra line you want
    }
    print {$wfh} $line;
}
close $rfh or die "Can't close $filename : $!";
close $wfh or die "Can't close $newfile : $!";
Or you could do this in a Perl one-liner (which will change the original file):
perl -pi -e 'chomp;s/^(HEADER.*)$/${1}42/;s/^(REMARK.*)$/Extra line\n$1/;$_.="\n"' whateveryourfileiscalled.txt
Caveat: Both of these are for systems where the line ending is "\n" (i.e. not Windows), adjust appropriately for other OSes. Update: the caveat is not actually correct, as pointed out by naikonta and wfsp, except possibly for the case outlined by Sixtease, also fixed in his update. Thanks to all of you.
Wow, I didn't know the print {$wfh} $line; construct. Does that disambiguate $wfh to be interpreted as a filehandle?
Yep, I picked that up from thedamian's Perl Best Practices (a must-read for every Perl programmer, IMO). As Fletch rightly points out, it's not necessary in this case, but I use it wherever I print to a filehandle (that's easier than figuring out what's wrong if I ever forget it :-)
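For anyone else who hasn't met the block form, here is a minimal sketch of where the braces start to matter. With a plain scalar filehandle both forms behave the same; once the handle comes out of a hash element or another expression, the braces are what tell print that it is a filehandle. The in-memory file and the %out hash are only there for illustration and are not from the posts above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Write to an in-memory "file" so the sketch needs no real file on disk.
my $buffer = '';
open(my $wfh, '>', \$buffer) or die "Can't open in-memory file: $!";

# With a plain scalar filehandle both forms do the same thing.
print $wfh "plain form\n";
print {$wfh} "block form\n";

# When the handle is stored in a hash (hypothetical %out), the block form
# is needed; print $out{log} "..." would be a syntax error.
my %out = (log => $wfh);
print { $out{log} } "handle taken from a hash\n";

close $wfh or die "Can't close in-memory file: $!";
print $buffer;
```

Using the braces everywhere, as described above, simply means never having to think about which case you are in.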
Caveat: Both of these are for systems where the line ending is "\n" (i.e. not Windows), adjust appropriately for other OSes.
I see no need for a caveat regarding \n in your example. This character is just Perl's internal representation of a line ending (a logical newline), so on a text-mode filehandle it becomes whatever the underlying OS that perl runs on actually uses to terminate lines. See how newlines are addressed in perlport.
Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!
Here's my guess:
Your file has Windows newlines (CR/LF), which your editor/viewer can deal with and displays correctly. Then you add Unix newlines (LF) to the lines you edit. Now the file has a mix of CR/LF and LF newlines, which confuses the editor, so it shows the CR characters.
If I'm correct, then I recommend either preprocessing the file with the dos2unix tool or addressing this in the Perl script itself.
update: The modified while loop could look like this:
while (my $line = <$rfh>) {
    chomp $line;
    if ($line =~ m/^HEADER/) {
        my $number = 42; # change to whatever number you want to use
        $line .= $number;
    }
    if ($line =~ m/^REMARK/) {
        print {$wfh} "Extra line\n"; # Change to whatever extra line you want
    }
    print {$wfh} $line, "\n";
}
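One more hedged variant, assuming the script runs on a Unix system while the input file has CR/LF endings: there chomp only removes the LF, so a stray CR stays at the end of the line. The sketch below strips either ending explicitly and writes a single consistent ending back. The fix_line helper and the sample data are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the output text for one input line.
sub fix_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                  # remove LF or CR/LF, whichever is there
    my $out = '';
    $out .= "Extra line\n" if $line =~ m/^REMARK/;
    $line .= 42 if $line =~ m/^HEADER/;    # 42 is just a placeholder number
    return $out . $line . "\n";
}

# Sample input with mixed endings, read from an in-memory filehandle.
my $input = "HEADER abc\r\nREMARK one\nOTHER\r\n";
open(my $rfh, '<', \$input) or die "Can't open in-memory file: $!";
my $result = '';
while (my $line = <$rfh>) {
    $result .= fix_line($line);
}
close $rfh or die "Can't close in-memory file: $!";
print $result;
```

If the output has to keep Windows endings, the final "\n" in fix_line could be swapped for "\r\n" instead.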
Re: Newbie: uses/limits of perl in editing files
by Dominus (Parson) on Nov 23, 2007 at 15:06 UTC
Tie::File is nice for stuff like that.
It makes the file look like an array, with one line in each element. Then you modify the array. As you do, the changes appear in the file.
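A minimal sketch of how that could look for the HEADER/REMARK task, assuming the same edits as in the other replies (append 42 to HEADER lines, add a line before each REMARK line). The temporary file and its contents are invented so the example is self-contained; with a real file you would tie that file directly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Create a small throwaway file so the sketch is self-contained.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "HEADER abc\nREMARK one\nOTHER\n";
close $tmp_fh or die "Can't close $filename : $!";

# Each element of @lines is one line of the file, without its line ending.
tie my @lines, 'Tie::File', $filename or die "Can't tie $filename : $!";

# Walk backwards so that inserting a line does not shift the ones still to visit.
for my $i (reverse 0 .. $#lines) {
    $lines[$i] .= 42 if $lines[$i] =~ m/^HEADER/;
    splice @lines, $i, 0, "Extra line" if $lines[$i] =~ m/^REMARK/;
}

untie @lines;    # done; the file now holds the modified lines
```

Note that this edits the file in place, so it is worth trying on a copy first.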
Re: Newbie: uses/limits of perl in editing files
by Sixtease (Friar) on Nov 23, 2007 at 14:08 UTC
You can do this with pretty much any scripting / programming language (if it has input/output capabilities and is Turing-complete). And Perl may be the most comfortable one for this.
The code to do something like what you described could look like this:
perl -pe '/^REMARK/ and print "the line you want to add\n"' < input_file > output_file
Re: Newbie: uses/limits of perl in editing files
by cdarke (Prior) on Nov 23, 2007 at 14:42 UTC
Extending your requirements a little, there is a neat feature that is useful when replacing tokens, like your HEADER and REMARK. You can execute code from within a substitute statement, for example:
$line =~ s/(HEADER|REMARK)/mysub($1)/ge;
That will call the user-written subroutine mysub every time HEADER or REMARK is found in the text. The argument passed is the text matched inside the parentheses. Whatever mysub returns will replace the token. It probably would not be worth it for the simple substitution you mentioned, but for more complex combinations it can be very powerful.
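To make that concrete, here is a small self-contained sketch. The %replacement table and the body of mysub are assumptions made up for illustration; the real subroutine can do whatever computation your files need.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical replacement table; the real values depend on your files.
my %replacement = (
    HEADER => 'TITLE',
    REMARK => 'NOTE',
);

# Receives the matched token and returns the text to put in its place.
sub mysub {
    my ($token) = @_;
    return $replacement{$token};
}

my $line = "HEADER one REMARK two HEADER three";
$line =~ s/(HEADER|REMARK)/mysub($1)/ge;   # /e runs the code, /g repeats it
print "$line\n";    # TITLE one NOTE two TITLE three
```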
I was looking at those s/// but I wasn't sure how I could get it to do some of the things I need, as the text which has to be substituted is different from file to file and it also appears elsewhere in the file, where it's not to be adjusted.
Was that clear?
This is a theoretical line in my file:
BOBBY X66666 A 345 674 A 123 488
The X66666 has to be changed to B22222. But the next file might have U33333 there instead of X66666, or worse still 666D3P, or even worse absolutely nothing at all. And I don't know what it might be unless I open up each of the text files and look at what the previous program did to it (something I'm trying to avoid by learning this!). The spacing SHOULD be constant across that line, but that's not guaranteed.
Anyway, that's some of what I'm trying to do. Thanks everyone for the speed and friendliness in helping me out!
s/^(\S+)(\s+)(\S+)(.*)/${1}${2}B22222${4}/;
This reads like:
(NOT WHITESPACE)(WHITESPACE)(NOT WHITESPACE)(EVERYTHING)
That collects the first 3 pieces into variables $1-$3, then the remainder of the line into $4, then reassembles the line from the pieces and the replacement.
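Here is a quick sketch of that substitution run against the sample line and two variations with a different code or irregular spacing. The extra sample lines are invented from the description above. One caveat, stated as an assumption: if the second field is missing entirely, this pattern would replace whatever comes next (the "A" in the sample), so that case would need its own check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @samples = (
    "BOBBY X66666 A 345 674 A 123 488",
    "BOBBY U33333 A 345 674 A 123 488",
    "BOBBY   666D3P A 345 674 A 123 488",   # irregular spacing
);

my @changed;
for my $line (@samples) {
    # $1 = first field, $2 = the gap, $3 = the code to replace, $4 = the rest
    (my $new = $line) =~ s/^(\S+)(\s+)(\S+)(.*)/${1}${2}B22222${4}/;
    push @changed, $new;
    print "$new\n";
}
```

Because the whitespace is captured in $2 and put back unchanged, the original spacing survives even when it is not constant.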
Re: Newbie: uses/limits of perl in editing files
by dwm042 (Priest) on Nov 23, 2007 at 14:57 UTC
Perl is as close to a Swiss Army knife of a scripting language as exists. If you can't write the code yourself, you can, in most circumstances, find a solution written for you on CPAN.
Having said that, at this stage, you probably need a beginning text on writing Perl. Something like Learning Perl would be appropriate.