PerlMonks  

Re^4: regex search for words with one digit

by tybalt89 (Prior)
on Sep 21, 2020 at 23:37 UTC ( #11122050=note )


in reply to Re^3: regex search for words with one digit (updated)
in thread regex search for words with one digit

Once you add in the restriction of "only one digit", the \b is required.
My code

my $text = "John P5ete Andrew Richard58 Nic4k Le7on5";
my @names = $text =~ /\b[^\W\d]*\d[^\W\d]*\b/g;
print "@names\n";

outputs

P5ete Nic4k

but without the \b's

my $text = "John P5ete Andrew Richard58 Nic4k Le7on5";
my @names = $text =~ /[^\W\d]*\d[^\W\d]*/g;
print "@names\n";

it outputs

P5ete Richard5 8 Nic4k Le7on 5

It's pulling patterns out of the middle of "words".
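
To see exactly where those mid-word matches come from, here is a small sketch that records the offset of each match of the \b-less pattern. The capture group and the @hits array are additions for illustration only; the sample string and regex are the ones from above.

```perl
use strict;
use warnings;

my $text = "John P5ete Andrew Richard58 Nic4k Le7on5";

# Same pattern as the \b-less version, wrapped in a capture group
# so each match can be paired with its start offset ($-[0]).
my @hits;
while ( $text =~ /([^\W\d]*\d[^\W\d]*)/g ) {
    push @hits, sprintf( '%s@%d', $1, $-[0] );
}

print "@hits\n";
```

"Richard58" starts at offset 18, so the hits at 18 and 26 both come from inside that one word, and likewise 34 and 39 for "Le7on5".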

Re^5: regex search for words with one digit
by AnomalousMonk (Bishop) on Sep 23, 2020 at 04:47 UTC

    Ah — I completely missed this point! Maybe a bit of occasional cargo-culting isn't entirely bad. :)

    I still think I would use (?<! \S) (?! \S) as boundary assertions instead of \b though, to properly handle "words" like 'a1-b1'. Anon gives no hint that such things may appear in the data, but an abundance of caution inclines me this way. More reflex defensiveness I guess.
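
A rough sketch of the difference between the two kinds of boundary, assuming a made-up sample string that includes the 'a1-b1' case (the string and variable names are hypothetical, not from the original question):

```perl
use strict;
use warnings;

# Hypothetical input containing a hyphenated "word".
my $text = "John P5ete a1-b1 Nic4k";

# \b sees a word boundary on either side of '-', so 'a1' and 'b1'
# each qualify as a separate one-digit word.
my @with_b = $text =~ /\b[^\W\d]*\d[^\W\d]*\b/g;

# The lookarounds demand whitespace (or string start/end) on both
# sides, so nothing inside 'a1-b1' can match.
my @with_ws = $text =~ /(?<!\S)[^\W\d]*\d[^\W\d]*(?!\S)/g;

print "@with_b\n";     # P5ete a1 b1 Nic4k
print "@with_ws\n";    # P5ete Nic4k
```

Note that the lookaround version does not treat 'a1-b1' as one word with two digits; it simply never matches there, because '-' is not in the character class. It would also skip a word followed directly by punctuation, such as "Nic4k,".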


    Give a man a fish:  <%-{-{-{-<

      Anon gives no hint that such things may appear in the data

      Furthermore, Anon gives no hint whether they should be treated as one word or two if they did. It's great to have both tools in the box but blindly applying one or other in the absence of a relevant spec becomes a guessing game.


      🦛

Re^5: regex search for words with one digit
by Bruder Savigny (Initiate) on Sep 24, 2020 at 00:56 UTC

    Very instructive! I hadn't looked closely at your code, embarrassingly, and this happens to be the first example I can remember where a \b is actually required. And I really can't think up an alternative without it. (It's not that I somehow don't like it, just that I've never had to use it.)

    On reflection, I would perhaps have used split and then something with grep {/^...$/} -- much clumsier and also amounting to the same thing in disguise, namely string anchors. I never realised that \b is a kind of ^ and $, but within the text.
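
For what it's worth, the split-and-grep idea might look something like this. It is only a sketch: it assumes "words" are whitespace-separated tokens, and it reuses the sample string and character classes from tybalt89's post.

```perl
use strict;
use warnings;

my $text = "John P5ete Andrew Richard58 Nic4k Le7on5";

# split ' ' breaks the string on runs of whitespace; the ^ and $
# anchors then play the role that \b played in the single regex.
my @names = grep { /^[^\W\d]*\d[^\W\d]*$/ } split ' ', $text;

print "@names\n";    # P5ete Nic4k
```

Because the tokens are whitespace-delimited, this behaves like the (?<!\S) ... (?!\S) variant rather than like \b: a token such as 'a1-b1' would be rejected as a whole.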

    (The point I was trying to make -- only half convincingly -- was that I think it's a good habit not to throw in things "for good measure" and move on as soon as it works, because that approach doesn't teach you a lot, and you sometimes walk away with a wrong or (worse!) fuzzy impression of what actually did the trick. Or both. I have experienced this a thousand times.)
