http://qs321.pair.com?node_id=1225551


in reply to Re^2: Pattern matching
in thread Pattern matching

Hi nursyza,

I see that parv has already explained the regex pattern for you. I wanted to let you know that you can use the YAPE::Regex::Explain module to get an explanation of any regular expression pattern. Once you have the module installed, you can do something like this at the command line to get the explanation for your pattern:

perl -MYAPE::Regex::Explain -E 'say YAPE::Regex::Explain->new("\b (MODULE \s+ [A-Z]+[0-9]+) \s* [(] .+? [)]")->explain'
Which gives the following output:
The regular expression:

(?-imsx: (MODULE s+ [A-Z]+[0-9]+) s* [(] .+? [)])

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
                         ' '
----------------------------------------------------------------------
  (                      group and capture to \1:
----------------------------------------------------------------------
    MODULE               'MODULE '
----------------------------------------------------------------------
    s+                   's' (1 or more times (matching the most
                         amount possible))
----------------------------------------------------------------------
                         ' '
----------------------------------------------------------------------
    [A-Z]+               any character of: 'A' to 'Z' (1 or more
                         times (matching the most amount possible))
----------------------------------------------------------------------
    [0-9]+               any character of: '0' to '9' (1 or more
                         times (matching the most amount possible))
----------------------------------------------------------------------
  )                      end of \1
----------------------------------------------------------------------
                         ' '
----------------------------------------------------------------------
  s*                     's' (0 or more times (matching the most
                         amount possible))
----------------------------------------------------------------------
                         ' '
----------------------------------------------------------------------
  [(]                    any character of: '('
----------------------------------------------------------------------
                         ' '
----------------------------------------------------------------------
  .+?                    any character except \n (1 or more times
                         (matching the least amount possible))
----------------------------------------------------------------------
                         ' '
----------------------------------------------------------------------
  [)]                    any character of: ')'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

You may also want to look at perlre to get more familiar with regular expressions.

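UPDATE: As parv, soonix, and AnomalousMonk pointed out (in the replies to this node), the above usage of YAPE::Regex::Explain is not correct. Passing the regex as a double-quoted string caused problems: Perl processed the backslash escapes before the module saw them (which is why \s shows up as a plain 's' in the output above), and without the /x flag every space in the pattern was treated as a literal space.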
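To make the quoting problem concrete, here is a minimal sketch (core Perl only, no YAPE needed) that compares what a double-quoted string, a single-quoted string, and a qr// object actually contain. The variable names $dq, $sq, and $re are just illustrative, and the pattern is shortened for readability.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Double-quoted string: Perl handles the backslash escapes itself.
# "\b" becomes a backspace character (chr 8) and "\s" becomes a plain "s".
# Warnings are switched off here only to silence the
# "Unrecognized escape \s passed through" message.
my $dq = do { no warnings; "\b (MODULE \s+ [A-Z]+[0-9]+)" };

# Single-quoted string: the backslashes survive untouched.
my $sq = '\b (MODULE \s+ [A-Z]+[0-9]+)';

# qr// builds a real regex object and can carry the /x flag,
# so the spaces in the pattern are ignored rather than matched literally.
my $re = qr/ \b (MODULE \s+ [A-Z]+[0-9]+) /x;

printf "double-quoted: first char is chr(%d), contains a backslash: %s\n",
    ord($dq), ( $dq =~ /\\/ ? 'yes' : 'no' );
printf "single-quoted: first char is chr(%d), contains a backslash: %s\n",
    ord($sq), ( $sq =~ /\\/ ? 'yes' : 'no' );

# The stringified form shows the flags that travel with the object;
# the exact spelling of the flag prefix depends on the Perl version.
print "qr// object:   $re\n";
```

The double-quoted version starts with a backspace and has no backslashes left, which matches the garbled explanation above. A single-quoted string keeps the escapes, but it still carries no /x flag, so the qr// form is the one that preserves both the escapes and the flag.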

The following code gives the correct output:

#!/usr/bin/env perl

use strict;
use warnings;
use YAPE::Regex::Explain;

my $re = qr/ \b (MODULE \s+ [A-Z]+[0-9]+) \s* [(] .+? [)] /x;
my $exp = YAPE::Regex::Explain->new($re)->explain;
print $exp;

exit;
Here is the output:
The regular expression:

(?x-ims: \b (MODULE \s+ [A-Z]+[0-9]+) \s* [(] .+? [)] )

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?x-ims:                 group, but do not capture (disregarding
                         whitespace and comments) (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n):
----------------------------------------------------------------------
  \b                     the boundary between a word char (\w) and
                         something that is not a word char
----------------------------------------------------------------------
  (                      group and capture to \1:
----------------------------------------------------------------------
    MODULE               'MODULE'
----------------------------------------------------------------------
    \s+                  whitespace (\n, \r, \t, \f, and " ") (1 or
                         more times (matching the most amount
                         possible))
----------------------------------------------------------------------
    [A-Z]+               any character of: 'A' to 'Z' (1 or more
                         times (matching the most amount possible))
----------------------------------------------------------------------
    [0-9]+               any character of: '0' to '9' (1 or more
                         times (matching the most amount possible))
----------------------------------------------------------------------
  )                      end of \1
----------------------------------------------------------------------
  \s*                    whitespace (\n, \r, \t, \f, and " ") (0 or
                         more times (matching the most amount
                         possible))
----------------------------------------------------------------------
  [(]                    any character of: '('
----------------------------------------------------------------------
  .+?                    any character except \n (1 or more times
                         (matching the least amount possible))
----------------------------------------------------------------------
  [)]                    any character of: ')'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
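
If it helps to see the explanation in action, here is a small sketch that runs the same pattern against a few input lines. The sample lines are made up for this example and are not taken from the thread, so adjust them to your real data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Same pattern as in the script above.
my $re = qr/ \b (MODULE \s+ [A-Z]+[0-9]+) \s* [(] .+? [)] /x;

# Hypothetical input lines, invented for illustration only.
my @lines = (
    'MODULE ABC123 (clk, reset)',   # matches
    'MODULE   XY9(a)',              # matches: \s+ takes several spaces, \s* takes none
    'MODULE abc1 (x)',              # no match: [A-Z]+ is case-sensitive
    'MODULE ABC ()',                # no match: [0-9]+ needs at least one digit
);

for my $line (@lines) {
    if ( $line =~ $re ) {
        # $1 holds what the capturing group matched.
        printf "matched:  %-28s captured: %s\n", $line, $1;
    }
    else {
        printf "no match: %s\n", $line;
    }
}
```

Note that only the MODULE name part is captured into $1; the parenthesised list is matched but not captured, because [(] .+? [)] sits outside the capturing group.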