
Re^3: Parsing/regex help required

by Fletch (Chancellor)
on Sep 27, 2021 at 19:50 UTC (#11137066)

in reply to Re^2: Parsing/regex help required
in thread Parsing/regex help required

The problem is that your dash is a fancy Unicode en dash (U+2013), not a plain "-" character, so my naïve attempt doesn't match. I had to do some monkeying with Encode after cutting and pasting your sample (which I don't think you'd need with Mojo when you're actually fetching your real results), but then I was able to get this to match.

  ## I set $_ to your sample string cut-n-pasted, then ran it through decode
  DB<33> $_ = Encode::decode( q{UTF-8}, $_ )

  ## Afterwards this worked (U+2013 is EN DASH); if you're not interested in what
  ## the separator was you can of course change that bit to non-capturing
  DB<38> x m{ ^ (\d+) \. \s+ (.*?) \s+(-|\N{EN DASH}|\N{EM DASH})\s+ (.*?) $}x
  0  123
  1  'The Quick brown fox'
  2  '\x{2013}'
  3  'jumped over'
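For anyone who would rather run this outside the debugger, here is a rough standalone sketch of the same idea. The sample line is my own reconstruction from the captures above (I don't have your exact input), the variable names are made up for illustration, and I'm assuming the data arrives as raw UTF-8 bytes, which is what a cut-n-paste gave me.

```perl
use strict;
use warnings;
use charnames ':full';   # only needed on perls older than 5.16
use Encode qw(decode);

# Hypothetical sample as raw UTF-8 bytes; E2 80 93 is the encoded EN DASH.
my $bytes = "123. The Quick brown fox \xE2\x80\x93 jumped over";

# Same pattern as in the debugger session, separator still captured.
my $re = qr{ ^ (\d+) \. \s+ (.*?) \s+ (-|\N{EN DASH}|\N{EM DASH}) \s+ (.*?) $ }x;

# Against the undecoded bytes there is no U+2013 character, so this fails.
my $matched_raw = $bytes =~ $re ? 1 : 0;

# Decode bytes into characters first, then match.
my $line = decode( 'UTF-8', $bytes );
my ( $num, $title, $sep, $rest ) = $line =~ $re;

# Show the separator as a code point so no wide character gets printed.
my $sep_name = sprintf 'U+%04X', ord $sep;

print "raw bytes matched: $matched_raw\n";
print "$num | $title | $sep_name | $rest\n";
```

If you read the data from a file yourself, opening it with an `:encoding(UTF-8)` layer does the decoding for you and the explicit decode call goes away.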

The cake is a lie.
