PerlMonks  

Re: Regex to pull out string within parenthesis that could contain parenthesis (updated)

by AnomalousMonk (Archbishop)
on Jul 09, 2018 at 17:58 UTC ( [id://1218175]=note )


in reply to Regex to pull out string within parenthesis that could contain parenthesis

Here's another, more factored example of the use of recursive subpatterns (introduced with Perl version 5.10):

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "use 5.010;
     ;;
     my $s = 'function convert(beg string, end string, read_date date, step char(6)) returns (date, date, char(1))';
     ;;
     my $rx_paren      = qr{ ( [(] (?: [^()]*+ | (?-1))* [)] ) }xms;
     my $rx_identifier = qr{ \w+ }xms;
     ;;
     my $parsed_ok = my @ra = $s =~ m{
         \A \s* (private|public)? \s* (function|report) \s* ($rx_identifier) \s*
         $rx_paren \s* ((returns) \s* $rx_paren)? \s* \z
         }xms;
     ;;
     if ($parsed_ok) {
       dd @ra;
       }
     else {
       print 'parse failed';
       }
     "
    (
      undef,
      "function",
      "convert",
      "(beg string, end string, read_date date, step char(6))",
      "returns (date, date, char(1))",
      "returns",
      "(date, date, char(1))",
    )
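To see the recursive part on its own, here is a minimal sketch that uses the same `$rx_paren` pattern outside the larger match. The helper name `first_paren_group` and the sample strings are made up for illustration; they are not from the original post.

```perl
use 5.010;
use strict;
use warnings;

# Same pattern as $rx_paren above. Capture group 1 is one balanced,
# parenthesized chunk; (?-1) recurses into the nearest preceding
# capture group, which here is group 1 itself.
my $rx_paren = qr{ ( [(] (?: [^()]*+ | (?-1) )* [)] ) }xms;

# Hypothetical helper: return the leftmost balanced group, or undef.
sub first_paren_group {
    my ($text) = @_;
    return $text =~ $rx_paren ? $1 : undef;
}

for my $text (
    'f(a, g(b, h(c)), d) tail',
    'step char(6)) returns (date)',
    'no parens here',
) {
    say "'$text' => ", first_paren_group($text) // 'no match';
}
```

Because `(?-1)` is a relative reference, the pattern keeps working when it is interpolated into a larger regex that has other capture groups in front of it, which is how the one-liner above uses it twice.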

Update: The (private|public)? \s* sub-expression in the above m// should probably be something like (untested)
    ((?: private | public) \s)? \s*
because, e.g., public is a word just like the function or report keyword that always follows it, and so requires some delimitation. As written, the original also accepts a run-together string such as 'publicfunction convert(...)'.
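Since the revised sub-expression is marked untested, here is a small sketch that compares the two variants side by side. The helper `make_parser` and the short test inputs are assumptions added for illustration; the rest of the pattern is copied from the one-liner above.

```perl
use 5.010;
use strict;
use warnings;

my $rx_paren      = qr{ ( [(] (?: [^()]*+ | (?-1) )* [)] ) }xms;
my $rx_identifier = qr{ \w+ }xms;

# Hypothetical helper: build the full pattern around a given
# access-modifier sub-expression, leaving everything else unchanged.
sub make_parser {
    my ($rx_access) = @_;
    return qr{
        \A \s* $rx_access (function|report) \s* ($rx_identifier) \s*
        $rx_paren \s* ((returns) \s* $rx_paren)? \s* \z
    }xms;
}

my $rx_original = make_parser(qr{ (private|public)? \s* }xms);
my $rx_revised  = make_parser(qr{ ((?: private | public) \s)? \s* }xms);

for my $input ('public function f(a)', 'publicfunction f(a)', 'function f(a)') {
    printf "%-24s original: %-5s revised: %s\n",
        "'$input'",
        ($input =~ $rx_original ? 'ok' : 'fail'),
        ($input =~ $rx_revised  ? 'ok' : 'fail');
}
```

Only the run-together input should differ between the two: the original pattern accepts it, while the revised one requires whitespace after the access modifier and rejects it.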


Give a man a fish:  <%-{-{-{-<


Replies are listed 'Best First'.
Re^2: Regex to pull out string within parenthesis that could contain parenthesis
by TheDamian (Vicar) on Jul 09, 2018 at 21:41 UTC
    Here's a variation on the above solution, using named recursive subpatterns and named captures.
    Nowadays I write all my non-trivial regexes this way.
    use 5.010;

    my $source = 'function convert(beg string, end string, read_date date, step char(6)) returns (date, date, char(1))';

    my $matched = $source =~ m{
        \A \s*+
        (?<access>  private | public  )?+        \s*+
        (?<keyword> function | report )          \s*+
        (?<name>    (?&identifier)    )          \s*+
        (?<params>  (?&list)          )          \s*+
        (returns \s*+ (?<returns> (?&list) ) )?+ \s*+
        \z

        (?(DEFINE)
            (?<identifier>  [^\W\d]\w*+  )
            (?<list>        [(]  [^()]*+  (?: (?&list) [^()]*+ )*+  [)]  )
        )
    }xms;

    if ($matched) {
        my %components = %+;
        use Data::Dumper 'Dumper';
        say Dumper \%components;
    }
    else {
        say 'parse failed';
    }
    which outputs:
    $VAR1 = {
        keyword => 'function',
        name    => 'convert',
        params  => '(beg string, end string, read_date date, step char(6))',
        returns => '(date, date, char(1))',
    };
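Note that there is no access key in that output: %+ only lists named groups that actually captured something, and the sample string has no private/public prefix. The sketch below wraps the same regex in a hypothetical helper, `parse_declaration` (not part of the post above), and feeds it a made-up declaration that does carry an access modifier.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: returns a hashref of the named captures,
# or undef if the declaration does not parse.
sub parse_declaration {
    my ($source) = @_;
    return unless $source =~ m{
        \A \s*+
        (?<access>  private | public  )?+        \s*+
        (?<keyword> function | report )          \s*+
        (?<name>    (?&identifier)    )          \s*+
        (?<params>  (?&list)          )          \s*+
        (returns \s*+ (?<returns> (?&list) ) )?+ \s*+
        \z

        (?(DEFINE)
            (?<identifier>  [^\W\d]\w*+  )
            (?<list>        [(]  [^()]*+  (?: (?&list) [^()]*+ )*+  [)]  )
        )
    }xms;
    my %components = %+;    # copy: %+ is reset by the next successful match
    return \%components;
}

# Made-up input with an access modifier and no returns clause.
my $decl = parse_declaration('private report totals(year int)');
if ($decl) {
    say "$_ => $decl->{$_}" for sort keys %$decl;
}
else {
    say 'parse failed';
}
```

Here the result should contain access, keyword, name and params, but no returns key, since the optional returns clause did not take part in the match.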

      I had entirely forgotten about named captures and  (?(DEFINE)...) — a much better (regex) approach.


      Give a man a fish:  <%-{-{-{-<
