PerlMonks  

How to replace spaces with different chars?

by ovedpo15 (Pilgrim)
on Jul 06, 2022 at 14:32 UTC ( #11145308=perlquestion )

ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
I have a list of paths. There is a bug in the utility which reported those paths. The bug is that some of the special chars are being replaced with spaces but we don't know which.
The list of special chars:
my @special_chars = (":", ";", ",", "=", "-");
Given that list of paths and the array of special chars, I want to report every candidate path that actually exists. It is a kind of brute force over all the options: as I understand it, the algorithm is to replace each space with one of the chars and check whether the resulting path exists. If so, report it.
If there were only one special char, I would just replace all the spaces in the path with that char and check whether the path exists using -e. But since there are multiple chars, and each space may stand for a different one, I'm not sure how to do it. I don't want to write 5 nested loops.
Some examples:
/a/b/c/d/e/fi:le
/a/b/c/d/e/fi:l;e
/a/b/c/d/e/f-i:l;e
/a/b/c/d/e/f-i-:l;e
/a/b/c/d/e/f:i-:l;e
In that case, the list that you get looks like:
/a/b/c/d/e/fi le
/a/b/c/d/e/fi l e
/a/b/c/d/e/f i l e
/a/b/c/d/e/f i  l e
And the previous list is what you should build.
Note that there can be several consecutive spaces, and the special chars that fill them can differ.
It feels like a question from "introduction to CS" but I can't seem to figure this one. How can it be done?

EDIT: Thanks all for the suggested implementations. I now understand that my suggested algorithm, filling in every combination and testing each resulting path, does not work.
How can I use the system instead? For example, given a path with spaces, check whether any existing path matches it with arbitrary chars in place of the spaces (not necessarily the special ones). If there are such paths, keep only those where the spaces were filled with the special chars. Would that be faster? How can it be done?
For example, for /a/b/c/d/e/f i l e it will find:
/a/b/c/d/e/f-i:l;e
/a/b/c/d/e/fai:l;e
/a/b/c/d/e/fai:lbe
Then, since only /a/b/c/d/e/f-i:l;e uses the special chars, only that one is printed.

Replies are listed 'Best First'.
Re: How to replace spaces with different chars?
by ikegami (Patriarch) on Jul 06, 2022 at 14:49 UTC
    my @special_chars = qw( : ; , = - );
    my $special_chars_glob_alt = "{".( join ",", map quotemeta, @special_chars )."}";

    my $corrupt_qfn = ...;

    my $glob =
       $corrupt_qfn =~ s{
          ( [ ] ) | ( [^\w ] )
       }{
          defined( $1 ) ? $special_chars_glob_alt : "\\$2"
       }xegr;

    my @possible_qfns = glob( $glob );
    say for @possible_qfns;

    or

    use Algorithm::Loops qw( NestedLoops );

    my @special_chars = qw( : ; , = - );

    my $corrupt_qfn = ...;

    my $iter = NestedLoops([
       map { $_ eq " " ? \@special_chars : [ $_ ] }
          split( /( )/, $corrupt_qfn, -1 )
    ]);

    while ( my @parts = $iter->() ) {
       my $possible_qfn = join( "", @parts );
       say for $possible_qfn;
    }

    Four spaces result in 5^4 = 625 possible paths.
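For readers without Algorithm::Loops, the same expansion can be sketched in plain Perl. This is only an illustration, not the code posted above: the helper name `candidates` and the sample path are made up for the example. It also adds the `-e` check the question asks for, since a brace-only glob pattern is expanded without looking at the filesystem.

```perl
use strict;
use warnings;

my @special_chars = ( ':', ';', ',', '=', '-' );

# Hypothetical helper: expand every space in $corrupt_qfn into each of
# @chars, building the Cartesian product one piece at a time.
sub candidates {
    my ( $corrupt_qfn, @chars ) = @_;
    my @results = ('');
    for my $piece ( split /( )/, $corrupt_qfn, -1 ) {
        # A space offers one choice per special char; anything else is fixed.
        my @choices = $piece eq ' ' ? @chars : ($piece);
        @results = map { my $head = $_; map { $head . $_ } @choices } @results;
    }
    return @results;
}

# Four spaces and five chars give 5**4 candidates.
my @all = candidates( '/a/b/c/d/e/f i  l e', @special_chars );
printf "%d candidates\n", scalar @all;

# Keep only the candidates that exist on disk.
print "$_\n" for grep { -e $_ } @all;
```

The list grows by a factor of five per space, which is why the replies below look for ways to let the filesystem do the matching instead.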

      s{ ( [ ] ) | ( [^\w ] ) }{ defined( $1 ) ? $special_chars_re_class : "\\$2" }egr;

      I think this needs the /x modifier to work as advertised.


      Give a man a fish:  <%-{-{-{-<

        Oops, fixed.

      The line at the bottom is hinting at the fact that this is not practical. What you could do instead is get a list of all the files, then use a regex match to find the matching ones.

      use File::Basename qw( fileparse basename );

      my $special_chars_re_class = "[".( join "", map quotemeta, @special_chars )."]";

      # fileparse returns the file name first, then the directory part.
      my ( $corrupt_fn, $dir_qfn ) = fileparse( $corrupt_qfn );

      my $glob = quotemeta( $dir_qfn ) . '*';

      my $re =
         $corrupt_fn =~ s{
            ( [ ] ) | ( [^\w ] )
         }{
            defined( $1 ) ? $special_chars_re_class : "\\$2"
         }xegr;

      $re = qr/^\Q$dir_qfn\E$re\z/;

      while ( defined( my $qfn = glob( $glob ) ) ) {
         next if $qfn !~ $re;
         say $qfn;
      }

      As written, this assumes the spaces are just in the file name.

      This could easily be adapted to check for multiple files at once.
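One possible shape for that adaptation, offered as a sketch rather than as the poster's intended code: group the corrupt paths by directory, read each directory once with opendir/readdir, and test every entry against one regex per corrupt name. The names `name_regex` and `find_originals` are hypothetical, and the sketch keeps the assumption that spaces occur only in the file name.

```perl
use strict;
use warnings;
use File::Basename qw( fileparse );

my @special_chars = ( ':', ';', ',', '=', '-' );
my $class = '[' . join( '', map { quotemeta($_) } @special_chars ) . ']';

# Hypothetical helper: anchored regex for one corrupt file name, where
# each space may be any one special char and everything else is literal.
sub name_regex {
    my ($corrupt_fn) = @_;
    my $pattern = join '',
        map { $_ eq ' ' ? $class : quotemeta($_) } split //, $corrupt_fn;
    return qr/\A$pattern\z/;
}

# Hypothetical helper: return the existing paths that could be the
# originals of the given corrupt paths, reading each directory once.
sub find_originals {
    my (@corrupt_qfns) = @_;
    my %regexes_by_dir;
    for my $qfn (@corrupt_qfns) {
        my ( $fn, $dir ) = fileparse($qfn);    # name first, then directory
        push @{ $regexes_by_dir{$dir} }, name_regex($fn);
    }
    my @found;
    for my $dir ( sort keys %regexes_by_dir ) {
        opendir( my $dh, $dir ) or next;       # skip unreadable directories
        my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
        closedir $dh;
        for my $name (@names) {
            push @found, "$dir$name"
                if grep { $name =~ $_ } @{ $regexes_by_dir{$dir} };
        }
    }
    return @found;
}

# Usage: pass the corrupt paths on the command line.
print "$_\n" for find_originals(@ARGV);
```

The cost is then proportional to the number of directory entries times the number of corrupt names in that directory, independent of how many spaces each name contains.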

        I'm going to parrot ikegami: "this is not practical".

        The real answer is to fix or replace the tool that gave you the incorrect output in the first place.

        -Scott

Re: How to replace spaces with different chars?
by LanX (Sage) on Jul 06, 2022 at 15:33 UTC
    Here is a partial solution using glob.

    But I don't know how to escape the "," appropriately ...

    use v5.12;
    use warnings;
    use Data::Dump qw/pp dd/;

    my $globber = join ",", (":", ";", "=", "-");

    for my $line (<DATA>) {
        chomp $line;
        say "--- $line";
        $line =~ s/\s/{$globber}/g;
        say pp [ glob($line) ];
    }

    __DATA__
    /a/b/c/d/e/fi le
    /a/b/c/d/e/fi l e
    /a/b/c/d/e/f i l e
    /a/b/c/d/e/f i  l e

    output

    Regarding the ",": one workaround may be to use a placeholder like "\0", which can't naturally appear, and to replace it afterwards.

    update

    The whole concept is dubious, because this will create far too many combinations. Apart from fixing the original issue, it's better to check incrementally from left to right with wildcards before progressing to the next position.

    my @part = </a/b/c/d/e/f{;,:,=,-}i*>;

    etc
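The incremental idea can be sketched as follows. This is an illustration under assumptions, not LanX's code: the helper names `glob_escape` and `incremental_matches` are invented here, and File::Glob's `bsd_glob` is used so that the pattern is not split on whitespace. After each space is filled, a prefix is kept only if `prefix*` matches something on disk, so dead branches are never expanded further.

```perl
use strict;
use warnings;
use File::Glob qw( bsd_glob );

# Hypothetical helper: escape the characters bsd_glob treats specially,
# so that a prefix is matched literally.
sub glob_escape {
    my ($text) = @_;
    $text =~ s/([\\\[\]{}*?~])/\\$1/g;
    return $text;
}

# Hypothetical helper: grow candidate prefixes one space at a time and
# drop every prefix that no existing path starts with.
sub incremental_matches {
    my ( $corrupt_qfn, @special_chars ) = @_;
    my @prefixes = ('');
    for my $piece ( split /( )/, $corrupt_qfn, -1 ) {
        if ( $piece ne ' ' ) {
            $_ .= $piece for @prefixes;    # literal text extends every prefix
            next;
        }
        my @survivors;
        for my $prefix (@prefixes) {
            for my $char (@special_chars) {
                my $candidate = $prefix . $char;
                my @hits = bsd_glob( glob_escape($candidate) . '*' );
                push @survivors, $candidate if @hits;
            }
        }
        @prefixes = @survivors;
    }
    # A surviving prefix is now a full path; confirm that it exists.
    return grep { -e $_ } @prefixes;
}

# Usage: first argument is the corrupt path.
if (@ARGV) {
    print "$_\n" for incremental_matches( $ARGV[0], ':', ';', ',', '=', '-' );
}
```

In the worst case this still explores every combination, but in a typical directory most prefixes die after the first or second space.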

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: How to replace spaces with different chars?
by kcott (Archbishop) on Jul 07, 2022 at 08:13 UTC

    G'day ovedpo15,

    "How can I use the system instead?"

    With these paths:

    ken@titan ~/tmp/pm_11145308_test_dir
    $ ls -1 a/b/c/d/e
    'f:i-:l;e'
    'fi:l;e'
    'f-i:l;e'
    'f-i-:l;e'
    fi:le

    And this code (find_file_match.pl):

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my @data = (
        'a/b/c/d/e/fi le',
        'a/b/c/d/e/fi l e',
        'a/b/c/d/e/f i l e',
        'a/b/c/d/e/f i  l e',
    );

    for my $datum (@data) {
        print "\n*** Files matching '$datum':\n";
        # $datum =~ y/ /?/; -- see update below
        $datum =~ s/ /[\\;:,-]/g;
        system("ls -1 $datum");
    }

    You get this output:

    ken@titan ~/tmp/pm_11145308_test_dir
    $ ./find_file_match.pl

    *** Files matching 'a/b/c/d/e/fi le':
    a/b/c/d/e/fi:le

    *** Files matching 'a/b/c/d/e/fi l e':
    'a/b/c/d/e/fi:l;e'

    *** Files matching 'a/b/c/d/e/f i l e':
    'a/b/c/d/e/f-i:l;e'

    *** Files matching 'a/b/c/d/e/f i  l e':
    'a/b/c/d/e/f:i-:l;e'
    'a/b/c/d/e/f-i-:l;e'

    Update: See ++LanX' valid comment regarding the inherent bug in the code I posted above. Changing y/ /?/ to s/ /[\\;:,-]/g fixes this.

    — Ken

      I had the same idea, it's far more efficient!

      But you still need a second step where you grep only those files matching [;:,-] at the missing spots; otherwise you will get false positives like fidlee.
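That second step might look like the sketch below. It assumes the candidate names have already been collected (for example by a `?` wildcard match); the function name `strict_filter` is hypothetical, and the char list is the one from the question, including "=".

```perl
use strict;
use warnings;

my @special_chars = ( ':', ';', ',', '=', '-' );

# Hypothetical helper: keep only the candidates that agree with the
# corrupt name everywhere except at the spaces, where they must have
# one of the special chars.
sub strict_filter {
    my ( $corrupt, @candidates ) = @_;
    my $class = '[' . join( '', map { quotemeta($_) } @special_chars ) . ']';
    my $pattern = join '',
        map { $_ eq ' ' ? $class : quotemeta($_) } split //, $corrupt;
    return grep { /\A$pattern\z/ } @candidates;
}

# 'fidlee' matches the wildcard form fi?l?e but not the strict form.
print "$_\n" for strict_filter( 'fi l e', 'fi:l;e', 'fidlee' );
```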

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        ++ Well spotted.

        If I add fidlee:

        $ ls -1 a/b/c/d/e
        'f:i-:l;e'
        'fi:l;e'
        'f-i:l;e'
        'f-i-:l;e'
        fi:le
        fidlee

        I get an additional line in the output:

        *** Files matching 'a/b/c/d/e/fi l e':
        'a/b/c/d/e/fi:l;e'
        a/b/c/d/e/fidlee

        Changing y/ /?/ to s/ /[\\;:,-]/g fixes this. I've updated my post.

        — Ken

Re: How to replace spaces with different chars?
by AnomalousMonk (Archbishop) on Jul 07, 2022 at 02:33 UTC
    Given that list of paths and the array of special chars, I want to report all existing possible paths. Kind of brute-force .... ... replace each space with one of the chars ...

    Here's a brute-force approach to doing what you want, but I'm not sure it will necessarily help you solve your problem. (And permutations are applied to all whitespace, not just blanks.)

    Output:
    Win8 Strawberry 5.8.9.5 (32) Wed 07/06/2022 21:46:31
    C:\@Work\Perl\monks\ovedpo15
    >perl Odo.t
    ok 1 - use Odo;
    1..8
    ok 2 - "/a/b/c/d/e/fi le"
    ok 3 - " /a/b/c/d/e/file"
    ok 4 - "/a/b/c/d/e/file "
    ok 5 - "/a/b/c/d/e/fi l e"
    ok 6 - "/a/b/c/d/e/fi \tle "
    ok 7 - "\t/a/b/c/d/e/fi\nle\f"
    ok 8 - no warnings

    Update: The technique of defining iterators is described in detail in Dominus's excellent Higher Order Perl; see free HOP download.


    Give a man a fish:  <%-{-{-{-<

Re: How to replace spaces with different chars?
by kcott (Archbishop) on Jul 07, 2022 at 14:05 UTC

    In my initial response I answered your question about using system; however, it occurs to me that you probably want to capture that output and do something with it. So, if you replace

    system("ls -1 $datum");

    with

    my @matches = `ls -1 $datum`;

    you'll now have an array of matches that you can process further. If that processing was

    print for @matches;

    you'll get identical output to what I originally posted.

    — Ken

      why not use

      my @matches = glob($datum)

      or

      my @matches = <$datum>

      directly?
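A glob-only version could be sketched like this, with no shell involved. It is an assumption-laden illustration: the function name `class_glob` is made up, the bracket class lists the five chars from the question, and `bsd_glob` from File::Glob is used instead of the core `glob` so that whitespace elsewhere in the pattern is not treated as a separator.

```perl
use strict;
use warnings;
use File::Glob qw( bsd_glob );

# Hypothetical helper: turn each space into a bracket class of the
# special chars, escape glob metacharacters in the rest, and let
# bsd_glob return the existing matches.
sub class_glob {
    my ($corrupt_qfn) = @_;
    my $pattern = join '', map {
          $_ eq ' '             ? '[;:,=-]'    # '-' last, so it is literal
        : /[\\\[\]{}*?~]/       ? "\\$_"       # literal glob metacharacter
        :                         $_
    } split //, $corrupt_qfn;
    return bsd_glob($pattern);
}

# Usage: pass corrupt paths on the command line.
for my $corrupt_qfn (@ARGV) {
    print "$_\n" for class_glob($corrupt_qfn);
}
```

Because the pattern contains a bracket class, a path with no match yields an empty list rather than the pattern itself, and names such as fidlee are never returned.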

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        This was a follow-up to my previous post about system, which has:

        "This is not what you want to use to capture the output from a command; for that you should use merely backticks or qx//, ..."

        Of course, TMTOWTDI. :-)

        — Ken

Node Type: perlquestion [id://11145308]
Approved by philipbailey
Front-paged by kcott