Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Re^5: Alternative to bytes::length() (7% solution)

by BrowserUk (Patriarch)
on Dec 23, 2009 at 13:40 UTC ( #814092=note: print w/replies, xml ) Need Help??


in reply to Re^4: Alternative to bytes::length() (7% solution)
in thread Alternative to bytes::length()

I believe that your 'utf8' case is mostly benchmarking the pulling out of the character count cached in the magic, thus completely missing the original problem.

I was mostly concerned with pointing out that it's important to consider exactly what you're benchmarking.

I'm also not convinced that there is enough of a description of "the original problem" to say whether I missed or not. I'm finding it quite hard to think of an application where the time taken to obtain the length of a string would be very significant?

Except maybe when sorting strings by length, at which point the caching would pretty much negate the first-time cost.

I don't see how your benchmark provides any justification for not using eq ''

Um...cos I modified the original benchmark and didn't think about it. Having just swapped the 'ne's for 'eq's, it does make a suprising difference. Though I haven't thought through whether that's down the efficiency of the operator or the change to the boolean logic.

It did surprise me greatly that ord worked out much faster than bytes::length.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
  • Comment on Re^5: Alternative to bytes::length() (7% solution)

Replies are listed 'Best First'.
Re^6: Alternative to bytes::length() (7% solution)
by creamygoodness (Curate) on Dec 23, 2009 at 16:09 UTC

    A parser which consumes the string would be an example application.

    I agree that in many cases the caching is going to hide or eliminate the problem, and in fact, it was a little tricky to compose the original example.

    However, in code that doesn't need to know the actual length of the string in characters, it's poor practice to use length(), whose cost scales with the size of the string on SVf_UTF8 scalars. The advantage of bytes::length() is that it's basically a lookup on a member of the SV struct. There's a little bit of extra math to deal with offsets, but it's still O(1) rather than O(n).

    The ne "" idiom is also O(1) because it's just checking to see whether the total byte size of the string is 0, not counting characters and seeing if the count is 0. Which makes it an effective replacement.

      A parser which consumes the string would be an example application.

      Then It would be better to test for the success of the operation that consumes the string, than perform that operation and then test to see if there is any string left to consume. Simplistically:

      cmpthese -1, { a=>q[ 1 while chop( $sa ) ], b=>q[ chop( $sb ) while $sb ne '' ] };; Rate b a b 8537447/s -- -59% a 20591583/s 141% --

      But most other operations that alter (reduce) the length of a scalar can be tested in the same way.

      That said, if there is any reason why the reducing operation itself cannot be used as the controlling condition, just testing the string for truth achieves the desired effect even more efficiently:

      [0] Perl> cmpthese -1, { a=>q[ 1 while chop( $sa ) ], b=>q[ chop( $sb ) while $sb ne '' ], c=>q[ chop( $sc ) while $sc ] };; Rate b a c b 8686428/s -- -2% -20% a 8821000/s 2% -- -19% c 10841356/s 25% 23% --

      Though I realise that can fail under some conditions.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://814092]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2023-05-28 12:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?