http://qs321.pair.com?node_id=218095


in reply to Re: meaning of /o in regexes
in thread meaning of /o in regexes

Just a note: none of this applies if you are using qr the way it's meant to be used, as the entire regex for m// or s///, as in $qr = qr/./; $_ = 'abc'; m/$qr/; s/$qr//; $_ =~ $qr. All of these more normal uses benefit from the precompilation. This note is about interpolating qr objects into other regular expressions, which is different.
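To make the "normal" uses concrete, here is a minimal sketch of a qr object used as the entire pattern in each of the three forms mentioned above. The variable names and the sample string are my own choices for illustration, not anything from the original question.

```perl
use strict;
use warnings;

# Compile the pattern once; $qr now holds a Regexp object.
my $qr = qr/b/;

my $text = 'abc';

# Used as the entire pattern of m//.
my $matched_m = ($text =~ m/$qr/) ? 1 : 0;

# Used directly on the right-hand side of =~.
my $matched_direct = ($text =~ $qr) ? 1 : 0;

# Used as the entire pattern of s/// (on a copy, so $text is untouched).
(my $stripped = $text) =~ s/$qr//;

print "m//: $matched_m, direct: $matched_direct, s///: $stripped\n";
# prints "m//: 1, direct: 1, s///: ac"
```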


Starting from the top: I created a short sample program and then dumped its opcode tree to see what it actually does. From this I can say that interpolating qr objects into another regular expression saves nothing. The objects are all concatenated (meaning stringified) and then the result is compiled as the regex. If you add the /o modifier to any m// or s/// operation then it binds the compiled form to that location in the opcode tree. There is no reason for that to change just because you used a qr in the regex. If you read Dominus' remarks on that at Dirty Secrets of the Perl Regex Engine then that will be clear.
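The stringification step can be seen without dumping opcodes. The following is only a sketch: it shows that a qr object turns into ordinary pattern text when interpolated, and that matching against the interpolated form behaves the same as matching against that text. The exact flag prefix in the string depends on the perl version ((?-xism:.) on older perls, (?^:.) from 5.14 on), so the code does not assume either.

```perl
use strict;
use warnings;

my $qr = qr/./;

# Stringifying a qr object yields its pattern text wrapped in a
# non-capturing group; the flag prefix varies by perl version.
my $as_string = "$qr";

# Interpolating two qr objects is equivalent to joining their strings
# and compiling the result as a new regex.
my $joined = "$qr$qr";

my $m_interp  = ('ab' =~ /$qr$qr/)  ? 1 : 0;   # two characters needed
my $m_string  = ('ab' =~ /$joined/) ? 1 : 0;   # same pattern, built by hand
my $too_short = ('a'  =~ /$qr$qr/)  ? 1 : 0;   # only one character available

print "stringified: $as_string\n";
print "interp: $m_interp, string: $m_string, too short: $too_short\n";
```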

The answers to your questions (in order):

  1. I don't know
  2. no (you are penalized)
  3. no (you are penalized)
  4. the same thing it always does
  5. no (you are penalized)

The penalty comes from having to do a magic_get on the qr ops instead of just reading them as strings, plus the overall cost of doing work more than once (compile the regex for qr, mg_get the stringified form, then compile the larger regex). Or at least that's how I read it. Please correct me if I'm wrong; I am still quite a novice at this.
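For answer 4 ("the same thing it always does"), a small sketch of what /o always does may help: the pattern is compiled the first time that match op runs and is never looked at again, even if the interpolated variable changes. The loop and variable names are hypothetical, chosen only to make the effect visible.

```perl
use strict;
use warnings;

my (@with_o, @without_o);

for my $pat ('a', 'b') {
    # /o: compiled on the first pass (when $pat is 'a') and kept from then on.
    push @with_o, ('b' =~ /$pat/o ? 1 : 0);

    # No /o: the current value of $pat is used on every pass.
    push @without_o, ('b' =~ /$pat/ ? 1 : 0);
}

print "with /o:    @with_o\n";     # prints "with /o:    0 0"
print "without /o: @without_o\n";  # prints "without /o: 0 1"
```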

$qr = qr/./;

'a' =~ /$qr$qr/;

__DATA__
C:\>perl -MO=Concise qr.pl
e  <@> leave[t1] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 5 qr.pl:1) v ->3
5     <2> sassign vKS/2 ->6
3        </> qr(/./) s ->4
-        <1> ex-rv2sv sKRM*/1 ->5
4           <> gvsv s ->5
6     <;> nextstate(main 5 qr.pl:3) v ->7
d     </> match() vKS ->e
7        <$> const(SPECIAL Null)[t5] s ->8
c        <|> regcomp(other->d) sK/1 ->d
8           <1> regcreset sK/1 ->9

>> This is where you see the two [qr] expressions
>> being fetched as global scalar values,
>> concatenated and *then* just above this the
>> regex is compiled.

b              <2> concat[t4] sK/2 ->c
-                 <1> ex-rv2sv sK/1 ->a
9                    <> gvsv s ->a
-                 <1> ex-rv2sv sK/1 ->b
a                    <> gvsv s ->b

I'm working off of the references http://perl.plover.com/Rx/ and perlop (the gory quoting part). See also pp_hot.c for pp_concat, which doesn't do anything special for qr magic. It's just strings at that point.

__SIG__ use B; printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;

Re: Re^2: meaning of /o in regexes
by BrowserUk (Patriarch) on Dec 06, 2002 at 17:00 UTC

    Thank you diotalevi++. That is exactly the sort of answer I was looking for, and it confirms my suspicions based on some fairly dodgy benchmarking.

    No matter how hard I tried to isolate the benefits of qr//'ing or /o'ing, those benefits always seemed to disappear whenever I attempted to combine one or more pre-compiled regexes with each other or with some non-compiled stuff. In fact, I sometimes detected a penalty from using pre-compiled regexes other than stand-alone, though the differences were too small to quantify with any accuracy.
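A comparison along these lines can be sketched with the core Benchmark module. This is not the benchmark I actually ran; the sample text, the character-class pattern and the labels are all made up for illustration, and no particular ranking is implied. Bear in mind that, as far as I know, perl can skip recompilation when the interpolated pattern string has not changed since the last execution, which would shrink any difference in a loop like this.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical inputs, chosen only so that every variant does the same work.
my $text      = 'abc' x 100;
my $piece_qr  = qr/[a-c]/;
my $piece_str = '[a-c]';
my $whole_qr  = qr/[a-c][a-c]/;

# Each sub counts the non-overlapping two-character matches in $text.
my %candidates = (
    interp_qr  => sub { my $n = () = $text =~ /$piece_qr$piece_qr/g;   $n },
    interp_str => sub { my $n = () = $text =~ /$piece_str$piece_str/g; $n },
    whole_qr   => sub { my $n = () = $text =~ /$whole_qr/g;            $n },
);

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```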


    Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
    Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
    Get used to the wings fast cos its an 8 hour day...unless the Govenor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
    Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

      I looked even further, and the get magic (see sv.c Perl_sv_2pv and seek to the "Regexp" section) associated with stringifying a qr regex is actually pretty cheap. I'd guess any real performance loss is just from having to compile a regex more than once which, unless you are doing some monster regex, isn't all that much of an issue.


        The task that started me on the quest for speed was indeed a monster regex, and it was also being called many times in a tight loop. As the regex was essentially repetitious, I originally thought that compiling the base regex with qr// and then using that in conjunction with a repeat count and the non-repeating elements, also compiled with /o, might yield some benefit, but the reverse was true. I settled for programmatically generating the regex (using x n) into a single large regex and then compiling it with qr// (which appeared to give some slight performance benefit over /o). This was possibly due to the fact that when compiled with qr//, you can use the resultant variable directly ($string =~ $compiled_re) rather than needing to embed it within an m// operator (m/$compiled_re/). Maybe it's slightly quicker to execute the former than the latter? The difference seemed significant enough to make it worthwhile.
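Roughly, the approach looks like the sketch below. The repeating unit, the repeat count and the test strings are hypothetical stand-ins (the real regex was much larger); the point is only the shape: build the pattern text with x, compile it once with qr//, then use the compiled object directly on the right of =~.

```perl
use strict;
use warnings;

# Hypothetical repeating unit and repeat count, for illustration only.
my $unit  = '\d{2}:';
my $count = 3;

# Build the full pattern text with the x operator, then compile it once.
my $pattern_text = '^' . ($unit x $count) . '\d{2}$';
my $compiled_re  = qr/$pattern_text/;

# Use the compiled object directly rather than embedding it in m//.
my @inputs  = ('12:34:56:78', '12:34:56', 'ab:cd:ef:gh');
my @results = map { ($_ =~ $compiled_re) ? 1 : 0 } @inputs;

print "@results\n";   # prints "1 0 0"
```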

