http://qs321.pair.com?node_id=1211954

skooma has asked for the wisdom of the Perl Monks concerning the following question:

The example patterns are: {aa_bb_cc_dd_... (characters with underscores, one or more)} (or) {a1_11_1c_1... (alphanumeric characters with underscores, one or more)} (or) {aaa... (just characters, one or more)}

Replies are listed 'Best First'.
Re: Looking for a regular expression for the following pattern.
by ikegami (Patriarch) on Mar 29, 2018 at 12:41 UTC
    [0-9_]*[a-zA-Z][a-zA-Z0-9_]*

    So to check if an entire string matches one of your sequences,

    if ($str =~ /^[0-9_]*[a-zA-Z][a-zA-Z0-9_]*\z/) { ... }
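A minimal sketch of how this pattern behaves on the OP's three examples and on a few edge cases. The helper name `is_sequence` and the extra test strings are assumptions added for illustration. Note that the pattern requires at least one letter, so "_" and "123" are rejected.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true if the whole string is
# made of letters, digits and underscores and contains at least one letter.
sub is_sequence {
    my ($str) = @_;
    return $str =~ /^[0-9_]*[a-zA-Z][a-zA-Z0-9_]*\z/ ? 1 : 0;
}

# The first three strings are the OP's examples; the rest are assumed edge cases.
for my $str ("aa_bb_cc_dd", "a1_11_1c_1", "aaa", "_", "123", "a\n") {
    (my $shown = $str) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-16s %s\n", qq("$shown"), is_sequence($str) ? "match" : "no match";
}
```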

      ++ for finally making it clear to me what \z is good for…


      s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
      +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re: Looking for a regular expression for the following pattern.
by Skeeve (Parson) on Mar 29, 2018 at 06:37 UTC

    So is it always an underscore after 2 characters?

    Otherwise this should do:

    /^[a-z0-9_]+$/i


      That matches "_" and "a\n"

        1. OP didn't say anything about linefeeds so it's safe to assume they can be ignored
        2. OP didn't say that "_" is not allowed. It's "alphanumeric characters with underscores, one or more"
        3. OP's post is poorly written so expect a poorly written reply ;)
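Both objections can be checked directly. A short sketch, assuming the intent is to reject a trailing newline and to require at least one alphanumeric character; both are guesses about what the OP wants, and the sub names are hypothetical:

```perl
use strict;
use warnings;

# The pattern as posted: $ also matches just before a trailing newline.
sub posted { return $_[0] =~ /^[a-z0-9_]+$/i ? 1 : 0 }

# Assumed stricter variant: \z anchors at the very end of the string, and the
# lookahead requires at least one letter or digit somewhere in the string.
sub strict_variant { return $_[0] =~ /^(?=_*[a-z0-9])[a-z0-9_]+\z/i ? 1 : 0 }

for my $str ("aa_bb", "_", "a\n") {
    (my $shown = $str) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-8s posted=%d strict=%d\n",
        qq("$shown"), posted($str), strict_variant($str);
}
```

Whether "_" alone should count as valid is exactly the kind of detail the question leaves open, so the stricter variant is only one possible reading.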

Re: Looking for a regular expression for the following pattern.
by Marshall (Canon) on Mar 29, 2018 at 09:57 UTC
    I don't understand your question. Please give more context information about what to match or not match.
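    In Perl regex lingo, \w+ matches one or more characters from a-zA-Z0-9_ (on Unicode strings it also matches other word characters unless the /a modifier is used).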

    Give an example of input data and what you expect to match.

    Update: I looked back at this thread, but I have yet to see a clear problem statement.
    Show an example of the "real world" data and what you expect to match from that.
    Writing regexes out of context just doesn't make any sense.

Re: Looking for a regular expression for the following pattern.
by LanX (Saint) on Mar 29, 2018 at 10:28 UTC
Re: Looking for a regular expression for the following pattern.
by hdb (Monsignor) on Mar 29, 2018 at 07:54 UTC

    Groups of at least two alphanumeric characters separated by underscores:

    /^(([a-zA-Z0-9]+)_)*[a-zA-Z0-9]+$/

      That matches "1" and "a\n". It doesn't match "a_".

        Next try...

        /^(([a-zA-Z0-9]{2,}_)*[a-zA-Z0-9]{2,})$/ && print "$_\n" for "1", "a\n", "aa\n", "a_", "aa", "aa_bb_cc_dd", "a1_11_1c_11", "aaa", "aa_";

        Still matches a trailing newline. It does not match "aa_" on purpose as this is not a sequence of two or more alphanumeric characters separated by underscores which was my claim if not the original question.

Re: Looking for a regular expression for the following pattern.
by AnomalousMonk (Archbishop) on Mar 29, 2018 at 14:33 UTC

    I, also, don't really understand the vaguely-stated requirements of the OP. The way I'd like to see a question like this posted (and a way that'd be more likely to get useful responses IMHO) is something like:

    Of course, one would omit the test regex or use a dummy in an initial query post.
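A minimal sketch of what such a post might look like, assuming Test::More. The regex under test is ikegami's pattern from earlier in the thread, and the expected results in the table are illustrative assumptions, since the OP never stated them:

```perl
use strict;
use warnings;
use Test::More;

# Regex under test. This is ikegami's pattern from earlier in the thread;
# in an initial question it would be omitted or replaced by a dummy.
my $rx = qr/^[0-9_]*[a-zA-Z][a-zA-Z0-9_]*\z/;

# Each case: [ input string, should it match? ]. The expectations are
# assumed for illustration, not taken from the OP.
my @cases = (
    [ "aa_bb_cc_dd", 1 ],
    [ "a1_11_1c_1",  1 ],
    [ "aaa",         1 ],
    [ "",            0 ],
    [ "_",           0 ],
    [ "a\n",         0 ],
    [ "a-b",         0 ],
);

for my $case (@cases) {
    my ($str, $expected) = @$case;
    (my $shown = $str) =~ s/\n/\\n/g;    # make the newline visible
    my $got = $str =~ $rx ? 1 : 0;
    is($got, $expected,
        qq("$shown") . ($expected ? " matches" : " does not match"));
}

done_testing;
```

Stating the cases this way lets anyone drop in a candidate regex and see at once which requirements it fails.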


    Give a man a fish:  <%-{-{-{-<

Re: Looking for a regular expression for the following pattern.
by Anonymous Monk on Mar 29, 2018 at 14:48 UTC
    Well, if the three alternatives are a bit tricky to reconcile, why not simply use three regexes and perhaps an if statement?

      No number of regexes and/or if-clauses will reconcile a requirement statement that doesn't state what is required.



        The first two requirements sound like the same thing: /[[:alnum:]]+[_]+/. But the third requirement is a subset of the first, consuming the alnum sequence with no need for an underscore. Two regexes joined by a logical-or (||) would do it. However, it just might be as simple as replacing the last + in the above regex with * to indicate "zero or more" underscores. The pattern now looks for one-or-more alnum characters followed by zero-or-more underscores.
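A minimal sketch of both ideas, assuming the intended pattern is `[[:alnum:]]+_*`, with the POSIX class written inside a bracketed character class as Perl requires. The sub names and the ^ and \z anchors are assumptions added for illustration:

```perl
use strict;
use warnings;

# One or more alphanumerics followed by zero or more underscores,
# collected repeatedly across the string (call in list context).
sub chunks {
    my ($str) = @_;
    return $str =~ /([[:alnum:]]+_*)/g;
}

# The "two regexes joined by ||" idea: underscore-separated groups, or a
# plain alphanumeric run. Anchoring the whole string is an added assumption.
sub either_form {
    my ($str) = @_;
    return ( $str =~ /^(?:[[:alnum:]]+_+)+[[:alnum:]]*\z/
          || $str =~ /^[[:alnum:]]+\z/ ) ? 1 : 0;
}

print join(" | ", chunks("aa_bb_cc")), "\n";    # aa_ | bb_ | cc

for my $str ("aa_bb_cc_dd", "aaa", "_") {
    printf "%s: %s\n", $str, either_form($str) ? "ok" : "no";
}
```

Unanchored, the single pattern only finds pieces of a string; validating a whole string still needs anchors, which is where the earlier replies' points about \z apply.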