http://qs321.pair.com?node_id=776463

citromatik has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I have encountered a code like this in a production program I couldn't make run:

use strict;
use warnings;

my $ref_file = $ARGV[0];

if (isReadableFile ($ref_file)) {
    executeComm ("cat $ref_file");
} else {
    print STDERR "$ref_file Does not exist\n";
}

## In a different module...

sub isReadableFile {
    my $file = shift;
    if (defined ($file) &&                # was a file name passed?
        ((-f $file) || (-l $file)) &&     # is the file a file or sym. link?
        (-r $file)                        # is the file readable?
       ) {
        return 1;
    } else {
        return 0;
    }
}

sub executeComm {
    my ($comm) = @_;
    print "$comm\n";
    system ($comm);
    print "$?\n";
}

A sample invocation should be something like:

$ perl test.pl file.txt

And if file.txt exists, is a regular file or links to a file and is readable, the command cat file.txt is executed inside the executeComm sub.

The problem arises when the file name or its path contains spaces:

$ perl test.pl file\ name.txt

This should be a valid invocation, but executeComm will receive the command cat file name.txt and will consequently fail. The same happens if $ perl test.pl 'file name.txt' is passed. With perl test.pl 'file\ name.txt' the cat command would succeed, but the file tests in isReadableFile fail.

A possible patch would be to escape every space after the file tests:

if (isReadableFile ($ref_file)) {
    $ref_file =~ s/ /\\ /g;    ###### Added
    executeComm ("cat $ref_file");
} else {
    print STDERR "$ref_file Does not exist\n";
}

But this seems a weak patch... Is this solution portable? Do you anticipate more problems? How would you solve this in production code?

citromatik

Replies are listed 'Best First'.
Re: Passing commands to subroutines
by JavaFan (Canon) on Jul 01, 2009 at 15:42 UTC
    The problem will not only be spaces. Anything that is special to the shell will be a problem. To avoid such problems, don't use 1-arg system. Use the multiple-arg invocation.

    See the system manual page, and the perlipc document for details.
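For illustration, here is a minimal sketch of the list form of system applied to the cat example from the question. It assumes a Unix-like system with cat on the PATH; the scratch directory and the file name 'file name.txt' are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Create a scratch file whose name contains a space (hypothetical name).
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'file name.txt' );
open my $fh, '>', $file or die "cannot create $file: $!";
print {$fh} "hello\n";
close $fh or die "cannot close $file: $!";

# List form: each list element reaches the program as one literal argument.
# With more than one argument no shell parses the command, so the space
# in the file name needs no escaping.
my $status = system( 'cat', $file );
print "exit code: ", $status >> 8, "\n";
```

The same file name passed through the one-argument form, system("cat $file"), would be split at the space by the shell.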

      This site has a pretty good discussion of these sorts of problems, and helped me in a similar situation.

      In particular this fix worked well for me and seemed fine when moving between Unix and Windows (I didn't try anything else)...

      "Spaces could be transparently handled (no pun intended) with U+00A0, a non-breaking space, which in fact it is. Really. If the system is presented with a filename containing U+0020, it just replaces it unilaterally with U+00A0."

      Hope this helps

      UPDATE:

      This clearly isn't how I got round the problem after all, as it doesn't solve the original problem (see example below)... Thanks JavaFan

      #!/usr/bin/perl
      use strict;
      use warnings;
      use encoding 'utf8';

      my $dir = "/tmp/Foo";
      unless (-e $dir) {
          mkdir $dir or die;
      }
      chdir $dir or die;

      opendir my $dh, $dir or die;
      my @files = readdir $dh;
      closedir $dh;
      print "Got ", scalar @files, " files\n";

      foreach my $char ("\x{20}", "\x{A0}") {
          my $file = "foo${char}bar";
          open my $fh, ">", $file or die;
      }

      opendir $dh, $dir or die;
      @files = readdir $dh;
      closedir $dh;
      print "Got ", scalar @files, " files\n";

      # "or" rather than "||", so the die applies to open/close and not to the last argument
      open my $fh, ">", 'foo baz' or die "Failed to open file : $!";
      close $fh or die "Failed to close file : $!";

      foreach my $char ("\x{20}", "\x{A0}") {
          my $file = "foo".$char."baz";
          if (-e $file) {
              print "got it\n";
          } else {
              print "not got it...\n";
          }
      }

      system("cat foo\x{20}baz");
      system("cat foo\x{A0}baz");

      Just a something something...
        Replacing characters in file names is just plain wrong in Unix. A space is a space, and not something else. And something that isn't a space, just isn't.
        #!/usr/bin/perl
        use 5.010;
        use strict;
        use warnings;

        my $dir = "/tmp/Foo";
        mkdir $dir or die;
        chdir $dir or die;

        opendir my $dh, $dir or die;
        my @files = readdir $dh;
        closedir $dh;
        say "Got ", scalar @files, " files";

        foreach my $char ("\x{20}", "\x{A0}") {
            my $file = "foo${char}bar";
            open my $fh, ">", $file or die;
        }

        opendir $dh, $dir or die;
        @files = readdir $dh;
        closedir $dh;
        say "Got ", scalar @files, " files";
        __END__
        Got 2 files
        Got 4 files
        See, two different files - one with a space, the other with a non-breaking space. No automatic conversion between them.

      Hmmm... the original code redirects the system call:

      executeComm ("program $ref_file > $outfile");

      Is there a way to use redirection in a system call using the multiple-arg version? This doesn't work: system ($program, $ref_file, ">",$outfile)

      citromatik

        Your whole reason to use the multiple argument version of system was to avoid having your arguments treated as anything but literal text.

        And now you're asking why it doesn't work when the code does exactly that.

        If you want redirection, you'll either have to build a shell command or do it yourself (IPC::Run, IPC::Run3, IPC::Open2, IPC::Open3, etc.)

        You'd use fork and exec. Did you read the perlipc manual page I suggested? It contains code that does what you want which you can copy and adjust. I'm not going to cut and paste it for you.
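As a rough sketch of the fork and exec approach (not the code from perlipc): the child reopens its own STDOUT onto the output file and then execs the command with a literal argument list. The helper name run_redirected and the file names are hypothetical, and a Unix-like system with cat is assumed.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempdir);

# Hypothetical helper: run @cmd with its STDOUT sent to $outfile, no shell involved.
sub run_redirected {
    my ( $outfile, @cmd ) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: point STDOUT at the output file, then replace this process.
        unless ( open STDOUT, '>', $outfile ) {
            warn "cannot write $outfile: $!\n";
            POSIX::_exit(127);
        }
        # Indirect-object form of exec never hands the command to a shell.
        exec { $cmd[0] } @cmd;
        warn "cannot exec $cmd[0]: $!\n";
        POSIX::_exit(127);
    }

    # Parent: wait for the child and return its wait status (same layout as $?).
    waitpid( $pid, 0 );
    return $?;
}

# Demonstration with spaces in both the input and the output file name.
my $dir     = tempdir( CLEANUP => 1 );
my $infile  = "$dir/in file.txt";
my $outfile = "$dir/out file.txt";

open my $in, '>', $infile or die "cannot create $infile: $!";
print {$in} "hello\n";
close $in or die "cannot close $infile: $!";

my $status = run_redirected( $outfile, 'cat', $infile );
print "exit code: ", $status >> 8, "\n";
```

Because the redirection happens inside the child, the parent's own STDOUT is never touched and does not need to be restored afterwards.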
        Given that you're facing a problem with tricky file names, the easiest substitute I could imagine for fixing a line of code like this:
        system("program $ref_file > $outfile");
        would be to do it like this:
        open( PROG, "-|", "program", $ref_file )
            or die "can't launch 'program' on $ref_file: $!\n";
        open( OUT, ">", $outfile )
            or die "can't write to $outfile: $!\n";
        while (<PROG>) {
            print OUT;
        }
        close PROG;
        close OUT;
        That ought to take any sort of goofy file name safely in stride (for both input and output files).
        Multiple-arg system() doesn't quite work like that (read this). Redirection using the shell metacharacter ">" is handled by the shell, which reads the entire string "program $ref_file > $outfile", parses it, and executes the command with redirection.

        When you do multiple-arg system, you're going raw and skipping the shell (usually). Redirections have to be performed manually and you'll lose some convenience you get with using a shell. Example:
        sub executeComm {
            my ($outfile, @comm) = @_;
            print "cmd: <", join("> <" => @comm), ">\n";

            # manual redirection - dup(2) STDOUT first
            open(my $ORIGSTDOUT, ">&" . fileno(*STDOUT)) or die $!;
            open(*STDOUT, ">", $outfile) or die $!;

            # run it!
            my $exit = system @comm;

            # restore STDOUT
            open(*STDOUT, ">&=" . fileno($ORIGSTDOUT)) or die $!;
            print "exit: ", $exit, "\n";
        }

        print "before\n";
        executeComm($outfile, $program, $ref_file);
        print "after\n";

        If you really want a quick-and-dirty fix for your "whitespace-in-filename" problem, place single-quotes around the filenames, as in:
        executeComm ("program '$ref_file' > '$outfile'");

        Now make sure you don't have single-quotes in the filenames...
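If the shell has to stay in the picture, one way around the embedded single-quote problem is to escape each quote before wrapping. The sketch below assumes a POSIX-style shell (it does not apply to cmd.exe); shell_quote_arg is a hypothetical helper name and the file names are invented. The CPAN module String::ShellQuote offers a ready-made shell_quote for the same purpose.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a string in single quotes for a POSIX shell,
# rewriting each embedded ' as '\'' (close quote, escaped quote, reopen quote).
sub shell_quote_arg {
    my ($arg) = @_;
    $arg =~ s/'/'\\''/g;
    return "'$arg'";
}

my $ref_file = "it's a file.txt";
my $outfile  = "out file.txt";

# Build the one-string command that a shell can parse safely.
my $command = sprintf 'program %s > %s',
    shell_quote_arg($ref_file), shell_quote_arg($outfile);
print "$command\n";
# prints: program 'it'\''s a file.txt' > 'out file.txt'
```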
        Have you read 'perldoc -f system'?
        system "foo >bar";
        invokes the shell, and it is the shell which does redirection (> <)