PerlMonks  

Re: What regex can match arbitrary-length quoted string?

by QM (Parson)
on Sep 29, 2017 at 09:48 UTC ( [id://1200367]=note )


in reply to What regex can match arbitrary-length quoted string?

It seems that the \\. alternative, which matches the \" in the target, is not possessive, and that this is what causes the error.

The following works in the given case. Are there cases where it fails?

print "NOT MATCHED!\n" unless /^ " (?: [^"\\]++ | (?: \\. )++ )*+ " /x ;
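A minimal sketch for probing the question above on small inputs. The helper name is_quoted and the four test strings are assumptions made for illustration, not part of the original post. Only the pattern itself is taken from the line above.

```perl
use strict;
use warnings;

# The pattern from the post, compiled once so every case uses the same regex.
my $quoted = qr/^ " (?: [^"\\]++ | (?: \\. )++ )*+ " /x;

# Hypothetical helper: true if the string starts with a complete quoted string.
sub is_quoted {
    my ($s) = @_;
    return $s =~ $quoted ? 1 : 0;
}

# In single quotes, \\ is one literal backslash.
my @cases = (
    '"plain"',                   # ordinary quoted string
    '"with \\" escaped quote"',  # contains an escaped quote
    '"unterminated',             # no closing quote
    '"trailing backslash \\"',   # the final quote is escaped, so nothing closes the string
);

for my $s (@cases) {
    printf "%-28s %s\n", $s, is_quoted($s) ? 'MATCHED' : 'NOT MATCHED';
}
```

The first two cases should match and the last two should not. Note that the pattern is anchored only at the start, so trailing text after the closing quote is not checked.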

Update:

Playing around more, it seems that the inner possessives are not needed; what matters is that the \\. has its own quantifier:

print "NOT MATCHED!\n" unless /^ " (?: [^"\\]+ | (?: \\. )+ )*+ " /x ;
This works on a string of length 100 million, in <2s on my machine. (I didn't try any longer strings.)

-QM
--
Quantum Mechanics: The dreams stuff is made of

Re^2: What regex can match arbitrary-length quoted string?
by QM (Parson) on Sep 29, 2017 at 13:26 UTC
    Assuming this is correct...

    Is this a simple omission or transcription error in perlre?

    Besides the entry in perlre, where else is the original solution given?

    Should it be updated or annotated?

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re^2: What regex can match arbitrary-length quoted string?
by Anonymous Monk on Sep 29, 2017 at 13:51 UTC
    Your update doesn't solve the problem.
        $_ = '"' . ('x\"' x 100000) . '"';
        print "NOT MATCHED!\n"
            unless /^ " (?: [^"\\]+ | (?: \\. )+ )*+ " /x;
        __END__
        Complex regular subexpression recursion limit (32766) exceeded at foo line 4.
        NOT MATCHED!
      Hmmm...I get "NOT MATCHED!", but no error message. But there are other inconsistencies.

      Putting several into a single script:

      #!/usr/bin/perl
      use strict;
      use warnings;

      our $quotes = '"' . ('\"' x 10000000) . '"';
      print "Length of string is ", length($quotes), "\n";

      our @qr;

      push @qr, qr/^ " (?: [^"\\]++ | \\. )*+ " /x ;
      push @qr, qr/^ " (?: [^"\\]++ | (?: \\. )++ )*+ " /x ;
      push @qr, qr/^ " (?: [^"\\]+ | (?: \\. )+ )*+ " /x ;

      for my $i (0..$#qr) {
          print "$i) $qr[$i] ==> ";
          print "NOT " unless $quotes =~ $qr[$i];
          print "MATCHED!\n";
      }
      __END__
      Length of string is 20000002
      Complex regular subexpression recursion limit (32766) exceeded at ./pm1200316.pl line 16.
      0) (?^x:^ " (?: [^"\\]++ | \\. )*+ " ) ==> NOT MATCHED!
      1) (?^x:^ " (?: [^"\\]++ | (?: \\. )++ )*+ " ) ==> MATCHED!
      2) (?^x:^ " (?: [^"\\]+ | (?: \\. )+ )*+ " ) ==> MATCHED!

      perl -v
      This is perl 5, version 22, subversion 1 (v5.22.1) built for x86_64-linux-gnu-thread-multi
      (with 58 registered patches, see perl -V for more detail)
      Copyright 1987-2015, Larry Wall
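One possible way to see why the two test strings in this thread behave differently is to count how many times the outer group would have to iterate on each. The sketch below tokenises the string body with the same two alternatives and counts the tokens. The helper name outer_iterations is hypothetical, and the link to the 32766 limit in the error message is an inference, not something confirmed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many tokens the body (the text between the
# outer quotes) splits into when consumed by the two alternatives in turn.
# Each token corresponds to one iteration of the outer (?: ... )* group.
sub outer_iterations {
    my ($body) = @_;
    # \G anchors each match where the previous one ended; the "() =" idiom
    # forces list context so the assignment yields the number of matches.
    my $count = () = $body =~ / \G (?: [^"\\]+ | (?: \\. )+ ) /gx;
    return $count;
}

my $n = 100_000;

# Only escapes: the whole run is swallowed by a single (?: \\. )+ token.
printf "escapes only : %d\n", outer_iterations('\"' x $n);    # prints 1

# Alternating plain character and escape: two tokens per repetition.
printf "alternating  : %d\n", outer_iterations('x\"' x $n);   # prints 200000
```

If the limit applies to iterations of the outer group, a string consisting only of escapes needs a single iteration once \\. carries its own quantifier, while the alternating string still needs far more than 32766.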

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

      Typo there: $_ = '"' . ('x\"' x 100000) . '"'; presumably should be $_ = '"' . ('\\"' x 100000) . '"';

      (Yes, it sometimes takes me a *long* time to read my email...)
