Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

How would I write this multiple substitution script?

by Carnival (Novice)
on Oct 13, 2011 at 09:14 UTC ( [id://931176]=perlquestion: print w/replies, xml ) Need Help??

Carnival has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I'm a novice Perl writer in need of a bit of help.

I need to carry out several substitutions, on the contents of several files, in several directories.. if that makes sense!

Currently I have all my substitutions as one-liners which execute on the current working directory, but I would like to combine them all into a single clean script.

Simple explanation for what I would like to do:
chdir to dir1 for contents of each file in dir1 do s/blue/red/; s/yellow/green/; close dir1 chdir to dir2 for contents of each file in dir2 do s/purple/orange/; close dir2 chdir to dir3 etc..

How would I do this? I have a test script below, the first part of which successfully chdirs to /attach and changes the filenames to lowercase. The second part is one of the scripts to change the contents of each file in a dir. I'm not sure if I'm headed in the right direction or not (or if any of this is salvageable?).

#!/usr/bin/perl use strict; use warnings; use File::Copy; ### Convert all filenames within /attach subdir to lowercase ### my $attchdir = '/attach'; chdir($attchdir) or die "Can't chdir to $attchdir $!"; opendir(ATTCHDIR, $attchdir) || die "Couldn't opendir: $!\n"; my @attchfiles = grep { $_ ne '.' && $_ ne '..' } readdir ATTCHDIR; foreach(@attchfiles) { my $attchfilename = $_; $attchfilename =~ tr/A-Z/a-z/; rename($_, $attchfilename); } closedir(ATTCHDIR); ### ### my $dir = '/html'; opendir(DIR, $dir) or die "Cannot open directory: $!\n"; my @files = readdir(DIR); closedir(DIR); foreach(@files) { my $filename = $_; open(FILE, $filename); while (my $data = <FILE>) { #print $data; $data =~ s/red/blue/g; } close(FILE); }

Replies are listed 'Best First'.
Re: How would I write this multiple substitution script?
by moritz (Cardinal) on Oct 13, 2011 at 09:31 UTC

    The substitution won't work that way, because you opened the file just for reading. So while you change $data the way you want it, you don't write it back into a file in some way.

    An easy way to do an in-place substitution is the use of the -i flag on the command line:

    perl -i.orig -pwe 's/red/blue/g' /html/*

    If you don't want to use a one-liner (or -i on the shebang line), you have to open a second file for writing, write the modified the contents to it, delete the source file and then move the file with the modified contents to the original location.

Re: How would I write this multiple substitution script?
by bart (Canon) on Oct 13, 2011 at 09:33 UTC
    You may have overlooked the fact that readdir returns the basename of the files, and as you don't chdir to that directory (like you described in your description), you have to include that directory in the file path for open.

    Also, readdir returns both plain files and directories, and you don't check for that. The file list may (depending on the platform) also include "." and ".."! Check with -f to see if it's indeed a file before you try to open it — again, don't forget about the directory.

    It might be a little easier to use glob to get the file list, for which the file names will include the directory name. Plus, you can provide a pattern for the file name. Also, glob will not return results for "." or "..", whatever the pattern. (Though, on Unix it might if the pattern for the basename starts with "."; I'm not sure.)

    As an optimization for writing code: you could pull out the actual file processing (the sequence of s/// statements) into separate subs, and pass a sub reference to a sub which does the boring stuff, which can be common for all directories and all files. Something like:

    process($dir1, sub { s/blue/red/; s/yellow/green/; }); sub process { my $dir = shift; my $sub = shift; ... # pretty much like your own code, until we have to process the + file contents: $sub->(); # note that the file contents is expected to be in $_ ... # rest of the wrapper code }
Re: How would I write this multiple substitution script?
by fisher (Priest) on Oct 13, 2011 at 09:26 UTC
    Why don't you test it on a test directories with test files?

    Yes, you headed in a right direction. In the first directory you rename files and in the second you substitute contents of a files.

    well, hint.

    foreach $filename (@files) { open FILEIN, "<", $filename; open FILEOUT, ">", $filename."new"; while (my $data = <FILEIN>) { $data =~ s/red/blue/g; print FILEOUT $data; } close FILEIN; close FILEOUT; }
    ...and then rename.
Re: How would I write this multiple substitution script?
by Carnival (Novice) on Oct 13, 2011 at 10:44 UTC
    Perfect explanations, and so quickly! I'm used to one-liners and so I didn't even think that I may be making the changes but not outputting the results somewhere. Much appreciated, thank you. I'm glad that I was at least facing in the right direction :)
Re: How would I write this multiple substitution script?
by hbm (Hermit) on Oct 13, 2011 at 15:08 UTC

    Twice you do something like this:

    foreach(@array){ my $item = $_; ...

    The common way is just for my $item (@array) {.

    Or take advantage of the default loop variable, $_. For example, this:

    foreach(@attchfiles) { my $attchfilename = $_; $attchfilename =~ tr/A-Z/a-z/; rename($_, $attchfilename); }

    Could be simplified as:

    rename($_,lc) for @attchfiles;

    But, you should probably make sure the lowercase file doesn't already exist. Perhaps like so:

    for(@attchfiles){ if ( -e lc ) { # do something appropriate! } else { rename $_,lc; } }
Re: How would I write this multiple substitution script?
by TomDLux (Vicar) on Oct 13, 2011 at 19:35 UTC

    You want to do a number of substitutions to each file in a directory, with different substitutions for each of a large number of directories.

    I would suggest putting the directories and associated substitutions into a data structure. It's a little simpler to put at the top of the file, but harder to modify. Reading in a YAML file containing the rules is a little more complicated to get working, but simpler to maintain over the years, if this is something that will be used for a long time.

    Since you're a novice, I'll suggest putting the data in the file:

    use autodie; use File::Temp qw(tempfile); my @RULES = ( { dir => '/path/to/dir1', mods => [ 'yellow', 'green' ], [ 'red', 'blue' ] ], }, { dir => '/path/to/dir2', mods => [ [ 'puce', 'chartreuse' ], [ 'lime green', 'ma +genta' ] ], }, ); ... for my $rule ( @RULES ) { chdir $rule->{dir}; for my $file ( glob( .* * ) ) { open my $input, '<', $file; my ($output, $outfile) = tempfile(); while ( my $line = <$input> ) { chomp $line; for my $sub ( @{ $rule->{mods} } ) { $line =~ s/$sub->[0]/$sub->[1]/ } print $output $line; } close $input; close $output; replace_file( $file, $outfile ); } }

    I leave for you the process of verifying the old file is successfully replaced by the new file. The regexes will be recompiled on each search-and-replace, but if they are really literal strings, that won't be too expensive. If there were matches, I would suggest precompiling the exporessions, but I'm not sure that works for search-and-replace.

    As Occam said: Entia non sunt multiplicanda praeter necessitatem.

Re: How would I write this multiple substitution script?
by Carnival (Novice) on Oct 14, 2011 at 07:57 UTC

    Thank you very much everyone, my script is now fully functional (after 3 days of wanting to cry).

    I really appreciate all your suggestions and criticism though, you gave me a lot of food for thought - now that it is functional I can concentrate on cleaning it up a bit.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://931176]
Approved by fisher
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others learning in the Monastery: (6)
As of 2024-04-19 13:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found