http://qs321.pair.com?node_id=704933


in reply to Extract sequence of UC words?

It grabs only one word because your pattern matches a run of upper case letters only, and the space after 'TEST' ends that run. To have it match 'TEST SENTENCE' you'll have to have it match upper case letters OR spaces.

But wait! The regex will then actually match ' TEST SENTENCE ' (including the space before and after the capitalized sequence). So what you really need is to make a match of:

  1. One upper case letter
  2. Any number of upper case letters/spaces
  3. One upper case letter

Requiring an upper case letter at both the beginning and the end also keeps it from matching just the 'F' of 'Foo'.
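
Here is a rough sketch of the two patterns side by side, in case it helps. The sample string is made up (I don't have the original input), and /[A-Z ]+/ is just my reading of the "letters OR spaces" version, so treat this as an illustration rather than a drop-in fix.

```perl
use strict;
use warnings;

# Hypothetical sample input, invented for this illustration.
my $text = 'this is a TEST SENTENCE for Foo';

# "Upper case letters OR spaces": also picks up lone spaces and
# the spaces around the capitalized sequence.
my @naive = $text =~ /([A-Z ]+)/g;

# 1. one upper case letter
# 2. any number of upper case letters/spaces
# 3. one upper case letter
my @fixed = $text =~ /([A-Z][A-Z ]*[A-Z])/g;

print "naive: [$_]\n" for @naive;   # includes [ TEST SENTENCE ] with its spaces
print "fixed: [$_]\n" for @fixed;   # only [TEST SENTENCE]
```

One thing to keep in mind: because steps 1 and 3 each consume a letter, this pattern needs at least two upper case letters, so a lone capital such as 'A' on its own will not match.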

Edit: gaal is smarter than I, heh.

Re^2: Extract sequence of UC words?
by Anonymous Monk on Aug 18, 2008 at 14:04 UTC
    Thanks, I modified it like so:
    /([A-Z\|\s+]+)+/
      | and + inside a character class aren't special, they're just regular characters, so your regex would match "FO O|B+++A R". /[A-Z ]+/ (which is what I think you probably meant) won't work either.

      Bonus points will be given if you tell us why!

      Update: BrowserUK has already seen what was missing. You missed out, AnonyMonk.

      Note that the character class [A-Z\|\s+] defines a set of characters that includes the '|' ('pipe') character. Within a character class, the pipe has no special meaning; i.e., it is not the regex alternation metacharacter. The same goes for '+', which is not a quantifier there.
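
A quick way to see this for yourself is sketched below. The test string is the one quoted earlier in the thread; the 'cat dog' string is just a hypothetical contrast case to show '|' doing alternation outside a class.

```perl
use strict;
use warnings;

# Every character here (letters, space, '|', '+') is a member of the
# class, so the anchored pattern matches the whole string.
my $odd   = 'FO O|B+++A R';
my $whole = $odd =~ /^[A-Z\|\s+]+$/ ? 1 : 0;

# Outside a character class, '|' is alternation.
# 'cat dog' is a made-up example string.
my @alts = 'cat dog' =~ /(cat|dog)/g;

print "class matched whole string: $whole\n";
print "alternation found: @alts\n";
```

If the class treated '|' or '+' as operators, the first match would fail on the literal '|' and '+' characters in the string.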