PerlMonks  

Escaping multiple escape chars

by JamesNC (Chaplain)
on Dec 09, 2005 at 20:17 UTC ( [id://515642]=perlquestion )

JamesNC has asked for the wisdom of the Perl Monks concerning the following question:

I am on Win32, AS 5.8.7.
I ran into a character substitution problem; any enlightenment would be great. I built a pure-Perl PDF module and was testing my tables with some ASCII art. I have to escape every '\' with '\\' in the string I embed in the PDF. This code does work, but only if the '\' is not followed by another '\':
$txt = ' \*/ ';
$txt =~ s/\\/\\\\/g;   # what I expected: ' \\*/ '
print $txt;
but it does not work when there are two or more:
$txt = ' __\\U//__ ';
$txt =~ s/\\/\\\\/g;
print $txt;            # prints __\\U//__
I needed it to do this: '__\\\\U//__'
Can anyone show me how to do this?
Thanks,
JamesNC
This is kind of solved, thanks to everyone who responded.
Update: I posted a follow-up. I would still like some more info on why ikegami's suggestion of streaming the data makes a difference.
Closing note:
This is a "special helpful programming feature" of perl that backslashes are interpreted as backslashes inside of single quotes except in the case where a backslash is followed by another backslash, in which case the pair escape each other. It is documented in perlop as was pointed out to me by ambrus. All this because I started messing with ascii art.

Replies are listed 'Best First'.
Re: Escaping multiple escape chars
by ikegami (Patriarch) on Dec 09, 2005 at 20:54 UTC

    In single quote string literals, "\\" is interpreted as "\", "\'" is interpreted as "'", and everything else is left as is.

    $txt = ' __\\U//__ ';
    print("$txt\n");        # Prints __\U//__

    $txt = ' __\\\\U//__ ';
    print("$txt\n");        # Prints __\\U//__

    $txt =~ s/\\/\\\\/g;
    print("$txt\n");        # Prints __\\\\U//__

    That only applies to string literals. If you had been reading from a file, your code would have worked:

    $txt = <DATA>;
    print("$txt\n");        # Prints __\\U//__

    $txt =~ s/\\/\\\\/g;
    print("$txt\n");        # Prints __\\\\U//__

    __DATA__
    __\\U//__
      At first, I thought I understood this, but this still is a little confusing to me. How does this work, and why doesn't using '' -vs- "" make a difference? I am intrigued and would like to understand this better. Is there a way to make perl think the data came from a stream?
      Thanks,
      James

        Why would ' vs " make a difference? \\ is used as an escape character for both.

        In single quote string literals, "\\" is interpreted as "\", "\'" is interpreted as "'", and everything else is left as is.

        In double quoted string literals, "\\" is interpreted as "\", "\"" is interpreted as """, ...

        Is there a way to make perl think the data came from a stream?

        No. That makes no sense. The script doesn't know whether the data came from a stream or not. The conversion happens during the parsing of the script by the Perl parser. If it didn't do this, you would have no way of specifying certain strings, such as the one containing solely \.
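A related option that the thread does not mention, offered only as a sketch: a here-document with a single-quoted terminator is still parsed as part of the script, but perlop documents that backslashes have no special meaning inside it, so `\\` stays two characters. The terminator name `END_ART` and the two sample lines are made up for illustration.

```perl
use strict;
use warnings;

# The single-quoted terminator turns off all escape processing,
# so the text below is stored exactly as typed.
my $art = <<'END_ART';
 \*/
 __\\U//__
END_ART

# Double every backslash, as the PDF string requires.
(my $escaped = $art) =~ s/\\/\\\\/g;

print $escaped;
# Prints:
#  \\*/
#  __\\\\U//__
```

This keeps the art inside the script while giving the same result as reading it from `__DATA__` or a file.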


      Doh! Thanks ikegami! I will load the art into files.
Re: Escaping multiple escape chars
by Roy Johnson (Monsignor) on Dec 09, 2005 at 20:42 UTC
    Backslash is a special character. When you assign
    $txt = ' __\\U//__ ';
    the first backslash is escaping the second, so the string has only one backslash in it. Print it and you'll see. When something other than a backslash or quote character appears after the backslash, the backslash isn't seen as an escape, and is interpreted literally. So your
    $txt = ' \*/ ';
    also has one backslash.

    Backslash is also special in the replacement side of the s///, so your four become two. To convert one to four, double the number of backslashes on the right.
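One way to check this without relying on how the terminal displays the text is to count the backslashes a string actually holds. The sketch below uses a helper named `count_backslashes`, which is a hypothetical name introduced here and not something from the thread.

```perl
use strict;
use warnings;

# Return how many backslash characters are stored in the string.
# tr/// with an empty replacement list counts without modifying.
sub count_backslashes {
    my ($s) = @_;
    return ($s =~ tr/\\//);
}

my $single = ' \*/ ';         # '\*' is not an escape, the backslash is kept
my $pair   = ' __\\U//__ ';   # '\\' in the source is stored as one backslash

print count_backslashes($single), "\n";   # 1
print count_backslashes($pair),   "\n";   # 1

# The replacement side of s/// is parsed like a double-quoted string,
# so the four backslashes typed there insert two.
(my $doubled = $pair) =~ s/\\/\\\\/g;
print count_backslashes($doubled), "\n";  # 2
```

Both literals hold exactly one backslash, which is why the substitution appears to "do nothing" to the second one when the source and the output are compared by eye.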


    Caution: Contents may have been coded under pressure.
Re: Escaping multiple escape chars
by jdporter (Paladin) on Dec 09, 2005 at 20:23 UTC

    Have you tried the quotemeta function?

    We're building the house of the future together.
      I just tried it. It has the same behavior. Thanks anyway. I even tried to split the characters and repack them with split //, '\\|//'; and split //, '\|//'; but split gives me 3 characters for both expressions.
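For comparison, here is a small sketch of what quotemeta does to the first test string. It backslashes every ASCII character that is not a letter, digit, or underscore, so it escapes far more than the backslash itself. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $txt = ' \*/ ';    # five characters: space, backslash, '*', '/', space

# quotemeta puts a backslash in front of every non-word character,
# including the spaces, the '*' and the '/'.
my $quoted = quotemeta($txt);

# The targeted substitution touches only the backslash.
(my $escaped = $txt) =~ s/\\/\\\\/g;

print length($txt),     "\n";   # 5
print length($quoted),  "\n";   # 10
print length($escaped), "\n";   # 6
```

Whether the extra backslashes are acceptable depends on the target format; the original post only asks for the backslash to be doubled.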
Re: Escaping multiple escape chars
by l.frankline (Hermit) on Dec 10, 2005 at 04:45 UTC

    In regular expressions, a non-alphanumeric character is matched literally
    only if it is preceded by a backslash (\); otherwise it may:

    • be treated as a Perl operator, or
    • raise an error, or
    • give undesired results.

    Let's take a look at your problem.

    \ (backslash is a non-alphanumeric character).

    \\ (in a regular expression, two backslashes are recognised as a single backslash)

    While trying the segment below...

    $txt =~ s/\\/\\\\/g;

    \\\\ (FOUR backslashes become 2 backslashes)

    therefore your results will be...

    __\\U//__

    For your desired output, try this one:

    $txt = ' __\\U//__ ';
    $txt =~ s/\\/\\\\\\\\/g;
    print $txt;

    regards
    Franklin

    Don't put off till tomorrow, what you can do today.

      l.frankline, unfortunately your substitution gives an incorrect result for the case \*/, which also appears in the ASCII art, because it produces extra '\' characters where there should be only 1.
      Here is the art btw:
      my $xmas = q% __,_,_,___) _______ + (--| | | (--/ ),_) ,_) + | | | _ ,_,_ | |_ ,_ ' , _|_,_,_, _ , + __| | | (/_| | (_| | | || |/_)_| | | |(_|/_)___, + ( |___, ,__| \____) |__, |__, + + | _...._ + \ _ / .::o:::::. + (\o/) .:::'''':o:. + --- / \ --- :o:_ _::: + >*< `:}_>()<_{:' + >0<@< @ `'//\\'` @ + >>>@<<* @ # // \\ # @ + >@>*<0<<< __#_#____/'____'\____#_#__ + >*>>@<<<@<< [__________________________ +] >@>>0<<<*<<@< |=_- .-/\ /\ /\ /\--. =_-| + >*>>0<<@<<<@<<< |-_= | \ \\ \\ \\ \ |-_=-| + >@>>*<<@<>*<<0<*< |_=-=| / // // // / |_=-_| + \*/ >0>>*<<@<>0><<*<@<< |=_- |`-'`-'`-'`-' |=_=-| + ___\\U//___ >*>>@><0<<*>>@><*<0<< | =_-| o o |_==_| + |\\ | | \\| >@>>0<*<<0>>@<<0<<<*<@< |=_- | ! ( ! |=-_=| + | \\| | _(UU)_ >((*))_>0><*<0><@<<<0<*< _|-,-=| ! ). ! |-_-=| +_ |\ \| || / //||.*.*.*.|>>@<<*<<@>><0<<@</=-((=_| ! __(:')__ ! |=_==_ +-\ |\\_|_|&&_// ||*.*.*.*|_\\db//__ (\_/)-=))-|/^\=^=^^=^=/^\| _=-_ +-_\ """"|'.'.'.|~~|.*.*.*| ____|_ =('.')=// ,------------. + jgs |'.'.'.| ^^^^^^|____|>>>>>>| ( ~~~ )/ (((((((()))))))) + ~~~~~~~~ '""""`------' `w---w` `------------' + %;
      I solved the problem by using the suggestion of moving the art into __DATA__ using $xmas = <DATA>; The follow on posts were because I think that this behavior is broken for the case of single versus double quoted strings, but thanks for trying.
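To make the difference between the two proposed replacements concrete, the following sketch applies both to the two test strings from the original post. The helper names `escape_with_four` and `escape_with_eight` are hypothetical and refer to the number of backslashes typed on the replacement side.

```perl
use strict;
use warnings;

# Four typed backslashes in the replacement insert two per match.
sub escape_with_four {
    my ($s) = @_;
    $s =~ s/\\/\\\\/g;
    return $s;
}

# Eight typed backslashes in the replacement insert four per match.
sub escape_with_eight {
    my ($s) = @_;
    $s =~ s/\\/\\\\\\\\/g;
    return $s;
}

# Each literal below stores exactly one backslash.
for my $src (' \*/ ', ' __\\U//__ ') {
    print "source: [$src]\n";
    print "  four:  [", escape_with_four($src),  "]\n";   # one backslash becomes two
    print "  eight: [", escape_with_eight($src), "]\n";   # one backslash becomes four
}
```

Because both literals contain a single backslash, neither replacement can treat them differently; the only way to get two stored backslashes from a quoted literal is to type four in the source, or to load the text from `__DATA__` or a file as described above.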

Node Type: perlquestion [id://515642]
Approved by jdporter
Front-paged by Roy Johnson