PerlMonks  

is there an easy way to dumb down this regular expression for me?

by moltar512 (Sexton)
on Oct 19, 2005 at 20:20 UTC

moltar512 has asked for the wisdom of the Perl Monks concerning the following question:

i was given the code:

    $A = -4;
    $w = 500;

then

    $re = '/^\s*(\Q$A\E\s*\*?\s*\Qcos\E\s+\Q$w\E\s*\Qt\E\s+\QA\E||\Q$A\E\s*\*?\s*\Qcos\E\s*\(\s*\Q$w\E\s*\Qt\E\s*\)\s+\QA\E)\s*$/';

will accept answers in the following form:

    -4cos 500t A
    -4 cos 500t A
    -4*cos 500t A
    -4*cos(500t) A

-- and any number of spaces before and after *, t, ( and )
It works great.. but that regular expression.. Jesus Christ.. Is there a way to dumb that down for me so that i can apply it to other situations, or would that take a complete O'Reilly book to figure out?

Replies are listed 'Best First'.
Re: is there an easy way to dumb down this regular expression for me?
by japhy (Canon) on Oct 19, 2005 at 20:26 UTC
    It also accepts an empty string, since you've got a doubled '|' in there. One | separates alternatives, not two.
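To see the effect of the doubled pipe in isolation, here is a minimal sketch. The names `$buggy` and `$fixed` and the cut-down foo/bar patterns are made up for illustration; they stand in for the much longer pattern in the question.

```perl
use strict;
use warnings;

# Cut-down stand-ins for the original pattern (hypothetical, for illustration).
my $buggy = qr/^\s*(foo||bar)\s*$/;   # "||" leaves an empty alternative in the middle
my $fixed = qr/^\s*(foo|bar)\s*$/;    # a single "|" separates the two alternatives

for my $input ('foo', 'bar', '', '   ') {
    printf "%-6s buggy=%-5s fixed=%s\n",
        "'$input'",
        ($input =~ $buggy ? 'match' : 'no'),
        ($input =~ $fixed ? 'match' : 'no');
}
```

The empty and whitespace-only inputs match only the pattern with the doubled pipe, because the empty alternative between the two pipes always succeeds.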

    As for dumbing it down, I'd rewrite it like so:

    # \E closes the quoting so the /x layout whitespace is not quoted too
    $A_rx = qr{ \Q$A\E }x;
    $w_rx = qr{ \Q$w\E }x;
    $rx = qr{
      $A_rx \s* \*? \s* cos \s*
      (?:
        \( \s* $w_rx \s* \*? \s* t \s* \)
      |
        $w_rx \s* \*? \s* t \s
      )
      \s* A
    }x;
    Then you can use it in the following manner: $string =~ /^\s*($rx)\s*$/.
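As a rough check of that usage, the sketch below runs the four sample answers from the question, plus two strings that should be rejected, through the anchored match. It assumes `\Q...\E` is written without layout whitespace inside it, and the `@samples` list is made up for illustration.

```perl
use strict;
use warnings;

my ($A, $w) = (-4, 500);

# \Q...\E quotes the interpolated value; closing it with \E keeps the
# /x layout whitespace out of the quoted part.
my $A_rx = qr{ \Q$A\E }x;
my $w_rx = qr{ \Q$w\E }x;

my $rx = qr{
    $A_rx \s* \*? \s* cos \s*
    (?:
        \( \s* $w_rx \s* \*? \s* t \s* \)    # bracketed form: cos(500t)
      |
        $w_rx \s* \*? \s* t \s               # bare form: 500t, then one whitespace
    )
    \s* A
}x;

# Hypothetical inputs: the four accepted forms, then two that should fail.
my @samples = (
    '-4cos 500t A', '-4 cos 500t A', '-4*cos 500t A', '-4*cos(500t) A',
    '-4cos 500tA', '',
);

for my $string (@samples) {
    my $verdict = $string =~ /^\s*($rx)\s*$/ ? 'match' : 'no match';
    print "'$string': $verdict\n";
}
```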

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart

      A nit - you'll notice that $re is assigned '/.../', not a qr expression. I prefer your assignment of the qr// but it's worth noting.
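A small sketch of the difference between the two assignments, using a shortened pattern. The names `$as_string` and `$as_qr` are made up for illustration.

```perl
use strict;
use warnings;

my $A = -4;

# Single quotes: nothing is interpolated, so this is an ordinary string
# that still contains the slashes and the literal characters "$A".
my $as_string = '/^\Q$A\E\z/';

# qr// compiles a pattern: $A is interpolated and quoted by \Q...\E.
my $as_qr = qr/^\Q$A\E\z/;

print "string form: $as_string\n";
print "qr form:     $as_qr\n";
print "ref of string form: '", ref($as_string), "'\n";   # plain scalar, so empty
print "ref of qr form:     '", ref($as_qr), "'\n";       # a Regexp object
print "qr form matches '-4'\n" if '-4' =~ $as_qr;
```

How the string form is later turned into a working pattern depends on the surrounding program, which the question does not show.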

      thanks! but i have to ask a few more questions. yours is infinitely easier to read, but now i'm trying to figure out what you did (using http://www.perlmonks.net/?node=perlop). what does the x behind all of your qr{ ... }x mean?
        Look at perlre as well for the /x modifier:
            x   Extend your pattern's legibility by permitting whitespace and comments.
        It was used so that everything could be spaced out for clarity. That's also why all the \s* are in there: when using /x you have to explicitly say when you actually want to match whitespace.
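A minimal sketch of what /x changes, using made-up patterns (`$plain`, `$spaced`, `$spaced_ws`) rather than the ones from the thread:

```perl
use strict;
use warnings;

# Without /x, the space in the pattern is a literal space that must be present.
my $plain = qr/^cos 500t\z/;

# With /x, whitespace in the pattern is ignored, so this is really "cos500t".
my $spaced = qr/^ cos 500 t \z/x;

# Whitespace you do want to match has to be spelled out, here with \s*.
my $spaced_ws = qr/
    ^ cos      # the function name
    \s*        # optional whitespace, stated explicitly
    500 t \z   # the frequency and the variable t
/x;

for my $input ('cos 500t', 'cos500t') {
    printf "%-10s plain=%d spaced=%d spaced_ws=%d\n", "'$input'",
        ($input =~ $plain     ? 1 : 0),
        ($input =~ $spaced    ? 1 : 0),
        ($input =~ $spaced_ws ? 1 : 0);
}
```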
Re: is there an easy way to dumb down this regular expression for me?
by diotalevi (Canon) on Oct 19, 2005 at 20:41 UTC

    You had a bug. Alternation is written with only *one* pipe character, not ||. The zero-length pattern between the two |s is itself a valid alternative, so you would also match anything that /^\s*$/ matches. The \Q...\E means quotemeta(). I've removed the \Q$A\E by doing the quotemeta outside the regexp. Since quotemeta has no effect on stuff like t and A, I removed it from those as well. I factored out the common parts of the expression so they aren't repeated. I also used the /x flag so I could add in as much extra whitespace as I want, which makes it easier to read. You'll notice how now everything except one branch of the alternation is followed by optional whitespace. That's to prevent "tA" from matching.

    $A = quotemeta -4;
    $w = quotemeta 500;
    $re = qr/^ \s* $A \s* \*? \s* cos \s* (?: \( \s* $w \s* t \s* \) | $w \s* t \s ) \s* A \s* $/x;
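To illustrate what quotemeta does and why it matters once a value contains a regex metacharacter, here is a short sketch. The value '2.5' and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# quotemeta backslashes every character that is not a letter, digit or underscore.
for my $raw (-4, 500, '2.5', 't', 'A') {
    printf "%-4s -> %s\n", $raw, quotemeta($raw);
}

# Hypothetical value with a metacharacter: the dot matches any character
# unless it is quoted.
my $value    = '2.5';
my $unquoted = qr/^$value\z/;
my $quoted   = qr/^\Q$value\E\z/;

print "unquoted pattern matches '2x5'\n"       if '2x5' =~ $unquoted;
print "quoted pattern does not match '2x5'\n"  unless '2x5' =~ $quoted;
print "quoted pattern matches '2.5'\n"         if '2.5' =~ $quoted;
```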
Re: is there an easy way to dumb down this regular expression for me?
by GrandFather (Saint) on Oct 20, 2005 at 00:28 UTC

    Fixing the || bug, refactoring the expression and commenting gives this:

    use strict;
    use warnings;

    my $A = -4;
    my $w = 500;

    my $re = qr/
        ^\s*        #Start of line followed by any amount of white space
        \Q$A\E      #Literal contents of $A
        \s*         #Any amount of white space
        \*?         #Optional *
        \s*cos      #Any amount of white space followed by cos
        \s*         #Any amount of white space
        (           #Start a capture group
          \s        #At least one white space character
          \Q$w\E    #Literal contents of $w
          \s*t      #Any amount of white space followed by t
        |           #Match either expression (i.e. - with or without brackets)
          \(\s*     #( followed by any amount of white space
          \Q$w\E    #Literal contents of $w
          \s*t      #Any amount of white space followed by t
          \s*\)     #Any amount of white space followed by )
        )           #End of capture group
        \s+A        #At least one white space character followed by A
        \s*$
        /x;

    while (<DATA>) {
        print "Match: $_" if $_ =~ $re;
    }

    __DATA__
    -4cos 500t A
    -4 cos 500t A
    -4*cos 500t A
    -4*cos(500t) A
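Since the original question was how to apply the pattern to other situations, one possible sketch is to wrap a commented pattern of this shape in a function that takes the values as arguments. The helper name `make_answer_re` and its parameters are hypothetical, and the second call (12, 60, 'V') is an invented example, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: build the answer pattern for any amplitude,
# frequency and unit instead of hard-coding -4, 500 and A.
sub make_answer_re {
    my ($amp, $freq, $unit) = @_;
    return qr/
        ^\s*
        \Q$amp\E \s* \*? \s* cos                # amplitude, optional star, then cos
        (?:
            \s+ \Q$freq\E \s* t                 # bare form, e.g. cos 500t
          |
            \s* \( \s* \Q$freq\E \s* t \s* \)   # bracketed form, e.g. cos(500t)
        )
        \s+ \Q$unit\E                           # at least one space, then the unit
        \s*\z
    /x;
}

my $current = make_answer_re(-4, 500, 'A');
my $voltage = make_answer_re(12, 60, 'V');

for my $answer ('-4cos 500t A', '-4*cos(500t) A', '-4cos 500tA') {
    print "'$answer': ", ($answer =~ $current ? 'match' : 'no match'), "\n";
}
print "'12 cos(60 t) V': ", ('12 cos(60 t) V' =~ $voltage ? 'match' : 'no match'), "\n";
```

Quoting each argument with \Q...\E inside the function means callers can pass plain values without worrying about metacharacters.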

    Perl is Huffman encoded by design.
Re: is there an easy way to dumb down this regular expression for me?
by ioannis (Abbot) on Oct 20, 2005 at 03:15 UTC
    Sometimes, it is more prudent to declare a pattern with our.

    Thanks to GrandFather, who styled and commented the whole mess. The pattern now takes a huge space on my screen and the Perl code is mixed with typical ugly regexes. So, how do I reuse the pattern in another module? Simple, you say: declare it with our so the pattern is accessible from everywhere. Could we do better and avoid the our declaration? Yes, you say: declare a function that returns the regex, and all we have to do later is check for a match using $_ =~ cos_pattern(). Let's declare the function:

     sub cos_pattern { $re }

    You see the problem? We have inadvertently created a closure.

    And how do we solve this problem? We go back to declaring with our again:

    our $re = qr / ....big... ...multiline... ...pattern... /x;
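A small sketch of the two ways of sharing a pattern discussed above, side by side. The package name `My::Patterns` and the shortened pattern are made up for illustration; the real pattern would be the long commented one.

```perl
use strict;
use warnings;

# Hypothetical package name, for illustration only.
package My::Patterns;

# "our" makes $re a package variable, reachable from other packages
# under its full name $My::Patterns::re.
our $re = qr/ ^ \s* cos \s* \( \s* \d+ \s* t \s* \) \s* \z /x;

# The other option from the text: hand the pattern out through a function.
sub cos_pattern { return $re }

package main;

print "via our: match\n" if 'cos(500t)'    =~ $My::Patterns::re;
print "via sub: match\n" if 'cos( 500 t )' =~ My::Patterns::cos_pattern();
```

In a real project the package would normally live in its own .pm file and be loaded with use; both packages are kept in one file here only so the sketch runs on its own.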
Re: is there an easy way to dumb down this regular expression for me?
by Anonymous Monk on Oct 21, 2005 at 14:56 UTC
    YAPE::Regex::Explain might help
    The regular expression:

    (?-imsx:^\s*(\Q$A\E\s*\*?\s*\Qcos\E\s+\Q$w\E\s*\Qt\E\s+\QA\E||\Q$A\E\s*\*?\s*\Qcos\E\s*\(\s*\Q$w\E\s*\Qt\E\s*\)\s+\QA\E)\s*$)

    matches as follows:

    NODE        EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:    group, but do not capture (case-sensitive) (with ^ and $ matching normally) (with . not matching \n) (matching whitespace and # normally):
    ^           the beginning of the string
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    (           group and capture to \1:
    \Q          'Q'
    $           before an optional \n, and the end of the string
    A           'A'
    \E          'E'
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \*?         '*' (optional (matching the most amount possible))
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \Q          'Q'
    cos         'cos'
    \E          'E'
    \s+         whitespace (\n, \r, \t, \f, and " ") (1 or more times (matching the most amount possible))
    \Q          'Q'
    $           before an optional \n, and the end of the string
    w           'w'
    \E          'E'
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \Q          'Q'
    t           't'
    \E          'E'
    \s+         whitespace (\n, \r, \t, \f, and " ") (1 or more times (matching the most amount possible))
    \Q          'Q'
    A           'A'
    \E          'E'
    |           OR
    |           OR
    \Q          'Q'
    $           before an optional \n, and the end of the string
    A           'A'
    \E          'E'
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \*?         '*' (optional (matching the most amount possible))
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \Q          'Q'
    cos         'cos'
    \E          'E'
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \(          '('
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \Q          'Q'
    $           before an optional \n, and the end of the string
    w           'w'
    \E          'E'
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \Q          'Q'
    t           't'
    \E          'E'
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    \)          ')'
    \s+         whitespace (\n, \r, \t, \f, and " ") (1 or more times (matching the most amount possible))
    \Q          'Q'
    A           'A'
    \E          'E'
    )           end of \1
    \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
    $           before an optional \n, and the end of the string
    )           end of grouping
    ----------------------------------------------------------------------
