Re: Escaping special characters in Password prompt to be passed onto SAP-CRM

by TomDLux (Vicar)
on Jun 23, 2016 at 19:03 UTC

in reply to Escaping special characters in Password prompt to be passed onto SAP-CRM

This is a regular expression that performs a match, so you aren't modifying anything and you aren't escaping anything; you're just detecting the presence of punctuation characters.

Your regex is equivalent to / A | B | C /xg (the x flag allows spacing for readability, which the regex engine ignores) ... a long list of alternatives. Alternative A is [\~]+, B is [\!]+, etc.
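To make the x-flag point concrete, here is a minimal sketch. The sample string and the third alternative, [\@]+, are my own assumptions for illustration; only A and B are spelled out above.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original question.
my $text = 'pa~~ss!word';

# Compact form: a list of alternatives, each a one-character class followed by +.
my @compact = $text =~ /[\~]+|[\!]+|[\@]+/g;

# The same pattern under the x flag: whitespace and comments inside the
# pattern are ignored, so each alternative can sit on its own line.
my @spaced = $text =~ /
      [\~]+    # alternative A: one or more tildes
    | [\!]+    # alternative B: one or more exclamation marks
    | [\@]+    # alternative C (assumed): one or more at-signs
/xg;

print join(' ', @compact), "\n";   # ~~ !
print join(' ', @spaced),  "\n";   # ~~ !
```

Both forms find the same two matches; the x flag only changes how the pattern is written, not what it matches.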

The g flag you use makes the regex be applied over and over, to detect the first matching character, the second matching character, ... until the end of the string is reached. If we drop it, we can consider what happens in a single pass, namely, the regex matches one of the alternatives A, B, C, .... (One difference: in list context the g flag returns every substring that matched, while without it, and with no capturing parentheses, we only get a true/false result.)
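The difference the g flag makes can be sketched as follows. The input string is hypothetical, and the pattern is a shortened stand-in for the long list of alternatives.

```perl
use strict;
use warnings;

my $text = 'a!b@c';   # hypothetical input

# List context with the g flag: every match is returned.
my @all = $text =~ /[!\@]/g;

# List context without g and without capturing parentheses:
# a successful match just returns the list (1).
my @once = $text =~ /[!\@]/;

# Scalar context with g: one match per evaluation, each one resuming
# where the previous match ended. pos() reports that resume point.
my @positions;
while ($text =~ /[!\@]/g) {
    push @positions, pos($text);
}

print "all:       @all\n";         # all:       ! @
print "once:      @once\n";        # once:      1
print "positions: @positions\n";   # positions: 2 4
```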

Alternative A is a sequence of one or more occurrences of the character class [\~]. The primary purpose of a character class is to denote multiple characters which can be accepted at a certain point. For example, Perl provides a built-in character class to accept digits, \d, but in other languages, such as a shell script, you might use the character class [0123456789], abbreviated [0-9]. A character class can also be used to strip away the "magic powers" of a regex meta-character such as '+' or '*' or '.' ... [+] or [*] or [.]. You could also escape these meta-characters with a backslash for the same effect, but some people are allergic to backslashes.
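As a small illustration of the two ways to disarm a meta-character, here is a sketch with a made-up input string:

```perl
use strict;
use warnings;

my $text = '1+1=2';   # hypothetical input

# Both of these look for a literal plus sign between two 1s.
my $via_class  = ($text =~ /1[+]1/) ? 1 : 0;
my $via_escape = ($text =~ /1\+1/)  ? 1 : 0;

# Left unprotected, + is a quantifier: /1+1/ means "one or more 1s,
# followed by a 1", which needs "11" somewhere in the string.
my $unprotected = ($text =~ /1+1/) ? 1 : 0;

print "class: $via_class, escape: $via_escape, unprotected: $unprotected\n";
# class: 1, escape: 1, unprotected: 0
```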

So you are detecting one or more occurrences of the character class containing the single punctuation character '\~' ... but escaping doesn't do anything here, so you are detecting /~+/. You detect one of those alternatives, and the regex ends, keeping track of where the match occurred. Then the g flag makes the regex run again, starting from that point, to look through the remainder of the string and detect a second match, and so on, and so on. Having the + sign after each character class detects multiple side-by-side occurrences of the same character as a single match. Without the + signs, they would be detected as individual characters.
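The effect of the + signs is easy to see by counting matches with and without them. This is a sketch on a hypothetical string, using only two of the alternatives:

```perl
use strict;
use warnings;

my $text = 'ab!!!cd~~';   # hypothetical input

# With +, a run of the same character is one match.
my @runs = $text =~ /[!]+|[~]+/g;

# Without +, every character is its own match.
my @singles = $text =~ /[!]|[~]/g;

print scalar(@runs),    " matches with +: @runs\n";        # 2 matches with +: !!! ~~
print scalar(@singles), " matches without +: @singles\n";  # 5 matches without +: ! ! ! ~ ~
```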

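What you really want is to detect any occurrence of the character class [-~!\@#\$%^&*()_+=`\[\]\\{}|;':",.\/<>?] ... having the '-' as the first character strips it of magic powers. Inside the class, ']' and '\' do need a backslash, '\$' and '\@' keep Perl from trying to interpolate a variable, and '\/' is needed if you use / as the delimiter. But really you want to detect all characters other than whitespace, ordinary letters and numbers. /[^\sa-zA-Z0-9]+/g will do that for you.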
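A quick sketch of the negated class at work, on a made-up password:

```perl
use strict;
use warnings;

# Hypothetical input; single quotes keep Perl from interpolating $$.
my $text = 'Pa$$ w0rd!';

# Anything that is not whitespace, an ASCII letter or a digit counts as punctuation.
my @punct = $text =~ /[^\sa-zA-Z0-9]+/g;

print scalar(@punct), " runs of punctuation\n";   # 2 runs of punctuation
print "$_\n" for @punct;                          # prints $$ and then !
```

Note that the space is skipped because of the \s, and that the underscore counts as punctuation here because it is not in a-zA-Z0-9.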

You want to escape those punctuation characters, so you want a search-and-replace regex...

my $password = fetch_user_password();
$password =~ s/([^\sa-zA-Z0-9])/\\$1/g;   # put a backslash in front of each punctuation character

It searches for a match in the first section, and replaces each match according to the second section. First it looks for any character in the character class EXCEPT ( because of ^ as first character ) a whitespace character (\s), a lower case letter (a-z) or upper case letter (A-Z) or a digit (0-9). If it finds a matching character, it saves it, because of the wrapping parentheses (), and goes on to the 'replace' phase. Here, we replace each match with a literal backslash (\\), followed by the text captured by the first set of parentheses. We could have several sets of things we grab, but in this case there's only the one.
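Here is the substitution as a small runnable sketch. The sub name escape_punctuation and the sample password are made up for illustration, and whether backslash-escaping is what the receiving system actually expects is something you would have to check.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the string with a backslash
# in front of every character that is not whitespace, a letter or a digit.
sub escape_punctuation {
    my ($password) = @_;
    $password =~ s/([^\sa-zA-Z0-9])/\\$1/g;
    return $password;
}

# Single quotes so that Perl itself leaves @ and $ alone.
my $escaped = escape_punctuation('p@ss w0rd!$');
print "$escaped\n";   # p\@ss w0rd\!\$
```

Perl's built-in quotemeta does a similar job, but it also backslashes whitespace and leaves the underscore alone, so it is not a drop-in replacement for the regex above.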

Hope that helps

As Occam said: Entia non sunt multiplicanda praeter necessitatem.