Re^4: Extract sequence of UC words?

by BrowserUk (Patriarch)
on Aug 18, 2008 at 17:13 UTC


in reply to Re^3: Extract sequence of UC words?
in thread Extract sequence of UC words?

Somewhat simpler:

print "'$1'" while $data =~ m/(\b[A-Z][A-Z\s]+[A-Z]\b)/g;;
'THIS IS A SENTENCE'
'SEQUENCE OF UPPER WORDS'

or

print "'$1'" while $data =~ m/(\b[A-Z][A-Z\s]+[^ ]\b)/g;;
'THIS IS A SENTENCE'
'SEQUENCE OF UPPER WORDS'
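
For anyone who wants to try both patterns outside a REPL, here is a minimal standalone sketch. The sample text in $data is an assumption (the thread's original $data is not shown in this node); only the two regexes come from the post above.

```perl
use strict;
use warnings;

# Hypothetical sample text; the thread's original $data is not shown here.
my $data = 'Here THIS IS A SENTENCE and a SEQUENCE OF UPPER WORDS follow.';

# First pattern: a capital, then capitals/whitespace, then a final capital.
my @first = $data =~ m/(\b[A-Z][A-Z\s]+[A-Z]\b)/g;

# Second pattern: same start, but the last character is any non-space.
my @second = $data =~ m/(\b[A-Z][A-Z\s]+[^ ]\b)/g;

print "first:  '$_'\n" for @first;
print "second: '$_'\n" for @second;
```

On this sample both patterns capture the same two runs. They can differ on other input: because [^ ] accepts any non-space character, text such as 'TWO WORDS x.' would let the second pattern capture the trailing lowercase 'x' as well.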

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^5: Extract sequence of UC words?
by monarch (Priest) on Aug 18, 2008 at 17:58 UTC
    The issue I have with your examples, BrowserUk, is that you are mandating at least 2 upper case letters. My regexp permits a single capital letter.

    I think it is important to have the optional section, because the desired expression is "one or more upper case letters" optionally followed by any number of "spaces followed by upper case letters".
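
    A rough sketch of the difference being described. The regex in @optional is assembled directly from the wording above ("one or more upper case letters", optionally followed by "spaces followed by upper case letters") and is an assumption, not necessarily the exact expression posted earlier in the thread; the sample text is hypothetical too.

```perl
use strict;
use warnings;

# Hypothetical sample text containing a lone capital letter.
my $data = 'a B c, then UPPER WORDS here';

# Assumed rendering of the description: capitals, then optional
# repetitions of whitespace followed by more capitals.
my @optional = $data =~ m/(\b[A-Z]+(?:\s+[A-Z]+)*\b)/g;

# The first pattern from the parent post, for comparison.
my @parent = $data =~ m/(\b[A-Z][A-Z\s]+[A-Z]\b)/g;

print "optional: '$_'\n" for @optional;
print "parent:   '$_'\n" for @parent;
```

    With this input the optional form picks up the lone 'B' as well as 'UPPER WORDS', while the parent pattern only finds 'UPPER WORDS'.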

      I upvoted your post above, but your regex m/(\b(?:[A-Z]+(?:\s+[A-Z]+)*)+\b)/g still made me squirm. Whenever I see sequences of nested quantifiers like +)*)+ I get uncomfortable, remembering various pathological cases I've constructed in the past.

      To that end, I thunk again and came up with this, which I believe meets the 'spec' whilst avoiding the nested quantifiers:

      m[ ( \b [A-Z] (?: [A-Z\s]* [A-Z] )? \b ) ]gx
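
      A quick way to check that pattern against the single-capital case, as a sketch only. The sample text is made up for illustration; the regex is the one given just above, unchanged.

```perl
use strict;
use warnings;

# Hypothetical sample text: a lone capital 'I' plus a run of upper case words.
my $data = 'I said THIS IS A SENTENCE, ok';

# One capital, optionally followed by capitals/whitespace ending in a capital.
# The /x flag lets the pattern be spaced out for readability.
my @found = $data =~ m[ ( \b [A-Z] (?: [A-Z\s]* [A-Z] )? \b ) ]gx;

print "'$_'\n" for @found;
```

      On this input the lone 'I' and 'THIS IS A SENTENCE' are both captured. The optional group contains a single greedy class followed by one required capital, so there is no quantifier nested inside another quantifier.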

