PerlMonks  

Re^5: Escaping multiple escape chars

by ikegami (Patriarch)
on Dec 10, 2005 at 14:22 UTC ( [id://515724] )


in reply to Re^4: Escaping multiple escape chars
in thread Escaping multiple escape chars

' and " are different. All of this is documented.
A literal should mean, I am to be taken literally

Nonsense. Literals are always interpreted. The literal 1234 in the expression $a = 1234; is a group of characters, yet they are interpreted as a number. And somehow, the quotes in the literal "abc" are removed.

Without interpretation, the compiler wouldn't know where the string ends and/or wouldn't allow certain characters to be part of the string. For example, what if there was a quote in your ASCII art? How would Perl know the string doesn't end at that quote, but rather at the following one (or the one after that)? For single-quoted strings, you precede the quote with a backslash. Of course, now we need a method of allowing backslashes followed by single quotes...
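A minimal sketch of that chicken-and-egg problem, using made-up strings that are not from the thread: the backslash lets a single quote into a single-quoted string, and a doubled backslash is then needed so that a string can end in a backslash.

```perl
use strict;
use warnings;

# \' puts a single quote inside a single-quoted string.
my $quoted = 'don\'t';

# \\ puts one backslash; written as 'C:\' the final \' would be read
# as an escaped quote and the string would not end there.
my $trailing = 'C:\\';

printf "%s has %d characters\n", $quoted,   length $quoted;
printf "%s has %d characters\n", $trailing, length $trailing;
```

Both escapes exist only because the string is embedded in source code; neither backslash-escape adds a character of its own to the stored value.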

Whenever something is embedded in something else, be it a string in a Perl source file, an object in a data file or text between HTML tags, some form of encoding or escaping is required. To be crystal clear: You can't have strings and Perl code in the same file without some form of escaping or encoding.

Re^6: Escaping multiple escape chars
by JamesNC (Chaplain) on Dec 10, 2005 at 16:29 UTC
    I have attempted to explain my question with 2 examples using 'some\text' and "some\text" both of which get stored as I understood they should. However, 'some\\text' and "some\\text" don't behave as you would expect and are NOT stored with the same rules as the first case because in the first case, the compiler sees the single quotes and stores the characters correctly by pushing an additional '\' to the PV array as it should. What is going wrong, however, is the case when it sees a second '\' in the literal case, and doesn't push another '\' onto the PV.
    use Devel::Peek;

    # literal case - PV's are the same
    $txt = 'some\text';
    print "Case 1 'some\\text': $txt\n-------\n";
    # PV gets an extra '\' pushed on it.
    Dump($txt);

    $txt = 'some\\text';
    print "\nCase 2 'some\\\\text: $txt\n-------\n";
    # PV does NOT get an extra '\' pushed on to it
    Dump($txt);

    # quoted case - PV's handle '\' as expected
    $txt = "some\text";
    print "\nCase 3 \"some\\text\": $txt\n-------\n";
    Dump($txt);

    $txt = "some\\text";
    print "\nCase 4 \"some\\\\text\": $txt\n-------\n";
    Dump($txt);
    If you actually try these examples, then you can see for yourself how the scalars are getting stored.

    Nuff said. I will ask the perl porters from here.
      However, 'some\\text' and "some\\text" don't behave as you would expect

      I got the following (abbreviated) results:

      Case 1 'some\text': some\text
      -------
      PV = 0x1aa01f0 "some\\text"\0

      Case 2 'some\\text: some\text
      -------
      PV = 0x1aa01f0 "some\\text"\0

      Case 3 "some\text": some ext
      -------
      PV = 0x1aa01f0 "some\text"\0

      Case 4 "some\\text": some\text
      -------
      PV = 0x1aa01f0 "some\\text"\0

      If you obtained the above, perl behaved as expected (or at least as documented). According to the relevant docs,

      double-quoted string literals are subject to backslash and variable substitution; single-quoted strings are not (except for \' and \\).

      In single quotes, perl interprets the backslash literally (instead of as an escape) unless it is followed by another backslash or by a single quote. This DWIMs most of the time, since few characters ever need to be escaped in single-quoted strings. For example, '\n' results in the string {backslash}+{lowercase n}.
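A small sketch of the single-quote rule, assuming nothing beyond core Perl (the variable names are only illustrative):

```perl
use strict;
use warnings;

my $single = '\n';   # backslash kept: two characters
my $double = "\n";   # escape processed: one newline character

# In single quotes, \\ collapses to one backslash while \t is left alone,
# so both spellings below yield the same nine-character string.
my $one_slash = 'some\text';
my $two_slash = 'some\\text';

print length($single), " ", length($double), "\n";
print $one_slash eq $two_slash ? "same string\n" : "different strings\n";
```

This is why Case 1 and Case 2 in the Devel::Peek run store identical PVs.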

      In double quotes, perl interprets the backslash and the following character as an escape (instead of literally). How perl interprets an invalid/unrecognized escape sequence is undocumented (or unclearly documented), but the behaviour has consistently been to replace the escape sequence with the character that followed the backslash. This DWIMs most of the time, since the extra backslash was probably a case of overzealous escaping. For example, "\<" results in the string {left angle bracket}.
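The fallback for an unrecognized escape in double quotes can be checked with a short sketch like this one (variable names are illustrative):

```perl
use strict;
# Warnings are deliberately left off here; with them enabled, perl
# reports the unrecognized escape when it compiles the literal.

my $escaped = "\<";   # \< is not a recognized escape sequence
my $plain   = "<";

print $escaped eq $plain ? "backslash dropped\n" : "backslash kept\n";
print length($escaped), "\n";
```

Compare this with '\<' in single quotes, where the backslash would survive and the string would be two characters long.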

      If you don't see the necessity of the equivalency of
      "some\\text"
      and
      'some\\text'
      try answering the following questions:

      • How would you code the string {backslash}+{backslash} using single quotes?
      • How would you code the string {backslash}+{backslash} using double quotes?
      • How would you code the string {backslash}+{backslash}+{single quote} using single quotes?
      • How would you code the string {backslash}+{backslash}+{double quote} using double quotes?
      • How would you code the string {backslash}+{single quote} using single quotes?
      • How would you code the string {backslash}+{double quote} using double quotes?
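One possible set of answers, as a sketch: each literal is compared against the same string built with chr(), which involves no quoting rules at all. The names $bs, $sq, $dq and @checks are only illustrative.

```perl
use strict;
use warnings;

my $bs = chr(92);   # one backslash
my $sq = chr(39);   # single quote
my $dq = chr(34);   # double quote

# Each pair: [ literal under test, expected string ]
my @checks = (
    [ '\\\\',   $bs . $bs       ],
    [ "\\\\",   $bs . $bs       ],
    [ '\\\\\'', $bs . $bs . $sq ],
    [ "\\\\\"", $bs . $bs . $dq ],
    [ '\\\'',   $bs . $sq       ],
    [ "\\\"",   $bs . $dq       ],
);

# grep in scalar context counts the pairs that do not match.
my $failures = grep { $_->[0] ne $_->[1] } @checks;
print $failures ? "mismatch\n" : "all six literals match\n";
```

In every case the backslash has to be doubled, in both quoting styles, which is the equivalency the questions are driving at.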
        You make excellent points, and I really appreciate your lengthy discussion of this with me. I think I have learned a very important lesson (it doesn't always DWIM inside of '').
        How would I code it? I don't think for 1 second that I could do a better job than those who faced this problem. But, I would love to try. I would have to see how it is done now. I know in coding, sometimes you paint yourself into a corner and for architecture reasons these things happen as artifacts of the design, and it cannot be changed.
        Thanks soooo much for discussing this with me.
        Regards,
        JamesNC
