 
PerlMonks  

Re^3: Parsing/regex help required

by Marshall (Canon)
on Sep 28, 2021 at 02:07 UTC ( [id://11137077]=note )


in reply to Re^2: Parsing/regex help required
in thread Parsing/regex help required

Perhaps you mean "em dash" instead of "en dash"?
It is called "em" because it is about as wide as the letter "M" in a variable-width font.
An en dash is shorter, about the width of the letter "n".

In any event, you will have to read the input with UTF-8 decoding. My Perl dev environment can only handle ASCII, so I cannot easily write code for this.

As far as regex goes:
You need to group an or'd expression, something like this: (-|em_dash)
To make it non-capturing: (?:-|em_dash)
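
To illustrate the difference between the two groupings, here is a minimal sketch. The variable $em_dash and the sample line are made up for illustration, and the sketch assumes the input has already been decoded so that the dash is a single character:

```perl
use strict;
use warnings;

# Assumption: the input is already decoded, so the em dash is one character.
my $em_dash = "\x{2014}";
my $line    = "foo - bar";

# Capturing group: the dash that matched is returned along with the words.
my @with_capture = $line =~ /^(\w+)\s*(-|$em_dash)\s*(\w+)\z/;

# Non-capturing group: same match, but only the two words are returned.
my @without_capture = $line =~ /^(\w+)\s*(?:-|$em_dash)\s*(\w+)\z/;

print scalar(@with_capture), " vs ", scalar(@without_capture), " fields\n";
```

The non-capturing form is handy when only the text on either side of the dash matters, since the capture numbering does not shift.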

The question is what "em_dash" should be, and how that relates to the decoding that was used when the data was read.

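Update: once the input has been decoded from UTF-8, an em dash is \x{2014} (an en dash would be \x{2013}).
As far as I know, "use utf8;" only matters if the dash is typed literally in the source code; the \x{2014} escape itself is plain ASCII.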
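
A rough sketch of how the decoding and the regex fit together, kept in pure ASCII source. The sample bytes and variable names are invented for illustration; an in-memory filehandle stands in for the real file, and for actual data the reference \$bytes would be replaced by a file name:

```perl
use strict;
use warnings;

# Raw UTF-8 bytes as they would sit in a file: "a", space, em dash, space, "b".
# \xE2\x80\x94 is the three-byte UTF-8 encoding of U+2014.
my $bytes = "a \xE2\x80\x94 b\n";

# Decode while reading, using an encoding layer on the filehandle.
open my $fh, '<:encoding(UTF-8)', \$bytes or die "open failed: $!";
my $line = <$fh>;
close $fh;
chomp $line;

# After decoding, the dash is a single character and \x{2014} matches it.
my $matched = $line =~ /(?:-|\x{2014})/ ? 1 : 0;

# Against the undecoded bytes the same pattern finds nothing.
my $raw_matched = $bytes =~ /(?:-|\x{2014})/ ? 1 : 0;

print "decoded: $matched, raw bytes: $raw_matched\n";
```

Run as written, this should report a match on the decoded line and none on the raw bytes, which is why the decoding used during the read matters for the pattern.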

Some Monks here are quite experienced with utf8 encoding.
Bring it on!
