PerlMonks  

Re: Extract a string from the line

by Tux (Canon)
on May 29, 2021 at 09:30 UTC ( [id://11133253]=note )


in reply to Extract a string from the line

What LanX said. Do not do it with regex.

Your regex is wrong (besides what LanX noted) in that it is extremely unsafe and misses some restrictions.

my @numbers = m{
    <stlib:membit       # Opening tag
    [^>]*               # Optional attributes
    >                   # End of opening tag
    \s*                 # Optional whitespace
    ([0-9]+)            # The number you want
    \s*                 # Optional whitespace
    </stlib:membit>     # Closing tag
    }gx;                # I want all of them in this line
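A minimal runnable sketch of that pattern, matched against a made-up input line (the `id` attribute and the values are assumptions for illustration only). The attribute part is written as `[^>]*` here so that it stops at the end of the opening tag:

```perl
use strict;
use warnings;

# Hypothetical input line; the attribute name and the numbers are invented.
my $line = '<stlib:membit id="a">12</stlib:membit> <stlib:membit> 7 </stlib:membit>';

# In list context, /g returns every captured group across all matches.
my @numbers = $line =~ m{
    <stlib:membit       # Opening tag
    [^>]*               # Optional attributes
    >                   # End of opening tag
    \s*                 # Optional whitespace
    ([0-9]+)            # The number you want
    \s*                 # Optional whitespace
    </stlib:membit>     # Closing tag
}gx;

print "@numbers\n";     # prints "12 7"
```

Binding with `$line =~` instead of matching against `$_` keeps the snippet independent of any surrounding loop.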

Enjoy, Have FUN! H.Merijn

Replies are listed 'Best First'.
Re^2: Extract a string from the line
by LanX (Saint) on May 29, 2021 at 10:19 UTC
    > What LanX said. Do not do it with regex.

    Yeah, but for clarification, I said "in most cases". :)

    Sometimes the XML is just so static and restricted that using a full parser would be overkill.

    pdftohtml -xml is one example for that.

    PS: if you want to allow optional whitespace, you might also want to add an /s modifier to match newlines too.
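A small check of the whitespace point, as a sketch on a made-up multi-line string: `\s` already matches newlines with or without `/s`, and the modifier only changes what `.` matches. Either way, the element has to be in a single string (for example a slurped file) for a match across lines to be possible at all.

```perl
use strict;
use warnings;

# Hypothetical input: one element spread over three lines, held in one string.
my $xml = "<stlib:membit>\n  42\n</stlib:membit>\n";

# Same pattern, without and with /s.
my $re_plain = qr{<stlib:membit[^>]*>\s*([0-9]+)\s*</stlib:membit>};
my $re_s     = qr{<stlib:membit[^>]*>\s*([0-9]+)\s*</stlib:membit>}s;

my @plain  = $xml =~ /$re_plain/g;   # \s* spans the newlines by itself
my @with_s = $xml =~ /$re_s/g;       # same result, /s changes nothing here

# The difference only shows up once the pattern uses a dot.
my $dot_plain = $xml =~ m{<stlib:membit>.*</stlib:membit>}  ? 1 : 0;
my $dot_s     = $xml =~ m{<stlib:membit>.*</stlib:membit>}s ? 1 : 0;

print "plain: @plain, with /s: @with_s\n";          # plain: 42, with /s: 42
print "dot without /s: $dot_plain, with /s: $dot_s\n";  # 0 and 1
```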

    Cheers Rolf
    (addicted to the Perl Programming Language :)
