PerlMonks
Perhaps you mean "em dash" instead of "en dash"?
It is called "em" because it is roughly as wide as the letter "M" in a variable-width font. An en dash is shorter, about the width of the letter "n". In any event, you will have to read the input with UTF-8 decoding. My Perl dev environment can only handle ASCII, so I cannot easily write code for this.
As far as the regex goes: the question is what "em_dash" should be, and that depends on how the data was decoded when it was read.
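Update: under some decoding scenarios an em dash is \x{2014}. That is the case when the input has been decoded into characters; an en dash would then be \x{2013}.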
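To make the decoding point concrete, here is a minimal sketch, not tested against the original poster's data. The sample string and the helper name `split_on_em_dash` are made up for illustration. It assumes the input arrives as raw UTF-8 bytes and uses the core `Encode` module to turn them into characters before matching. The source stays pure ASCII by writing the dashes as escapes.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample input, written as raw UTF-8 bytes, the way it would
# arrive from a filehandle opened without a decoding layer.
# "\xE2\x80\x94" is the UTF-8 byte sequence for U+2014 (em dash),
# "\xE2\x80\x93" is the one for U+2013 (en dash).
my $bytes = "Title \xE2\x80\x94 Subtitle, pages 10\xE2\x80\x9312";

# Without decoding, the string holds three separate bytes per dash,
# so a pattern for the single character U+2014 cannot match.
my $raw_matches = ($bytes =~ /\x{2014}/) ? 1 : 0;

# Decode the bytes into Perl characters: each dash is now one character.
my $text = decode('UTF-8', $bytes);

# Hypothetical helper: split a decoded string on em dashes only,
# trimming whitespace around the dash. En dashes are left alone.
sub split_on_em_dash {
    my ($str) = @_;
    return split /\s*\x{2014}\s*/, $str;
}

my @parts = split_on_em_dash($text);

printf "raw bytes match: %d\n", $raw_matches;
printf "parts after decoding: %d\n", scalar @parts;
```

When reading from a file, opening it with a decoding layer, as in `open my $fh, '<:encoding(UTF-8)', $path`, should give the same effect as the explicit `decode` call, so that `\x{2014}` in the regex lines up with what is in the string.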
Some Monks here are quite experienced with UTF-8 encoding.

In reply to Re^3: Parsing/regex help required by Marshall