PerlMonks  

Re^3: Bidirectional lookup algorithm? (Judy)

by BrowserUk (Patriarch)
on Jan 14, 2015 at 22:02 UTC


in reply to Re^2: Bidirectional lookup algorithm? (Judy)
in thread Bidirectional lookup algorithm? (Updated: further info.)

I finally managed to build a 64-bit version of Judy.

I started with this one-file/1250 line version and hacked out all the -DSTANDALONE and -DASKITIS stuff along with all the BIG_ENDIAN stuff; extracted a Judy.h; and got the filesize down to 965 lines and built it into a dll:

C:\test\Judy>cl /W3 /Ot /favor:INTEL64 /MT /LD Judy.c
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

Judy.c
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation. All rights reserved.

/out:Judy.dll
/dll
/implib:Judy.lib
Judy.obj
   Creating library Judy.lib and object Judy.exp

I then wrote a C program to use it to create two Judy arrays, and stored my test data 'aaaaa'..'zzzzz' paired with a 64-bit integer:

built it against the dll:

C:\test\Judy>cl /W3 /Ot JudyTest.c Judy.lib
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

JudyTest.c
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation. All rights reserved.

/out:JudyTest.exe
JudyTest.obj
Judy.lib

A run:

C:\test\Judy>JudyTest.exe aaaaa
Check memory:
Bidi lookup of 11881376 pairs took: 6.325 seconds
Check memory: 524,332k
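
The pair count in that run is 26^5 = 11,881,376, one pair per five-letter key. As a rough illustration of the test data only, here is a minimal sketch of one way to pair each key with a 64-bit integer and get back again. It assumes a plain base-26 numbering, which is not necessarily the numbering JudyTest.c used, and the names index_to_key, key_to_index, KEY_LEN and N_KEYS are hypothetical. It does not touch Judy at all; in the real test each direction goes through its own Judy array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_LEN 5             /* keys run from "aaaaa" to "zzzzz" */
#define N_KEYS  11881376ULL   /* 26^5, the pair count reported above */

/* Integer -> key: write the base-26 digits of n, most significant first.
 * Assumed numbering for illustration: "aaaaa" is 0, "zzzzz" is 26^5 - 1. */
static void index_to_key(uint64_t n, char key[KEY_LEN + 1])
{
    assert(n < N_KEYS);
    for (int i = KEY_LEN - 1; i >= 0; --i) {
        key[i] = (char)('a' + n % 26);
        n /= 26;
    }
    key[KEY_LEN] = '\0';
}

/* Key -> integer: the inverse of index_to_key(). */
static uint64_t key_to_index(const char *key)
{
    uint64_t n = 0;
    for (int i = 0; i < KEY_LEN; ++i)
        n = n * 26 + (uint64_t)(key[i] - 'a');
    return n;
}
```

A test loop would call index_to_key() for every n below N_KEYS, store the pair in both arrays, and then time the lookups in each direction.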

Then I built it as an Inline::C module, adding method wrappers for the important functions:

Unfortunately, in this form, the runtime increases -- mostly, I think, due to the perl->C->perl transitions -- from about 6.3 seconds to over 25s:

C:\test\Judy>perl Judy.pm -ODO=aaaaa
Memory before building Judy: 10,760 K
Memory after building Judy: 347,660 K
Bidi lookups of 11881376 pairs took:25.197204113 seconds
Memory after lookups: 347,680 K

So, whilst it does use somewhat less memory than my BiMap version, it is also somewhat slower.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Replies are listed 'Best First'.
Re^4: Bidirectional lookup algorithm? (Judy)
by syphilis (Archbishop) on Jan 15, 2015 at 00:49 UTC
    I finally managed to build a 64-bit version of Judy

    For the record, I finally managed to build a static 64-bit library, too. (Credit to BrowserUk and oiskuu.)
    oiskuu suggested that my problems with non-constant expressions arose because the compiler chose not to fold constant expressions where undefined behaviour is involved - and that I should use (Word_t)0x100 instead.
    Doing that allowed the build to proceed, and 'make check' passed its tests.

    However, I was still seeing a number of "left shift count >= width of type" warnings, and I don't know how thorough the test suite (which completes very quickly) is.
    I therefore don't have a lot of confidence that the library is going to behave reliably in the real world.
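
A minimal sketch of the kind of fix oiskuu suggested, under stated assumptions: Word_t here is a stand-in typedef for a 64-bit unsigned type, not Judy's own definition, and shifted_word is a hypothetical helper. A bare 0x100 has type int, so shifting it by 32 or more is undefined behaviour wherever int is 32 bits, and that is also what the "left shift count >= width of type" warning is about. Widening the operand before the shift avoids both.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 64-bit word type standing in for Judy's Word_t. */
typedef uint64_t Word_t;

/* Shift 0x100 left by `count` bits in 64-bit arithmetic. */
static Word_t shifted_word(unsigned count)
{
    assert(count < 56);   /* 0x100 << 55 is the largest result that fits */

    /* The cast widens the operand BEFORE the shift. Writing
     * (Word_t)(0x100 << count) would still shift a 32-bit int first. */
    return (Word_t)0x100 << count;
}
```

Note that on 64-bit Windows both long and unsigned long are 32 bits wide, so any word type defined in terms of unsigned long would need the same care.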

    With the Judy-0.41 patches that anonymonk posted I was able to successfully build the module on 32-bit perls.
    However, the Judy module that I built against the 64-bit library loaded ok, but kept crashing during the running of its test suite.
    In Judy.xs, I changed the hard coded "long" casts to "long long" casts. That got rid of lots of warnings during the build, but the crashing persisted.
    The generated Judy.c file still contained some casts to "unsigned long int" , but I couldn't immediately find what was generating those casts - at which point I totally lost interest.
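
One plausible (unconfirmed) reason such casts matter: on 64-bit Windows a long or unsigned long is 32 bits while a pointer is 64, so a pointer that passes through either type can lose its upper half. A minimal sketch of a width-safe round trip, assuming a C99 <stdint.h> is available; ptr_to_word and word_to_ptr are hypothetical names, not anything from Judy.xs.

```c
#include <assert.h>
#include <stdint.h>

/* Carry a pointer as an integer without truncation: uintptr_t is
 * defined to be wide enough to hold a converted void pointer. */
static uintptr_t ptr_to_word(const void *p)
{
    return (uintptr_t)p;
}

/* Recover the pointer from its integer form. */
static void *word_to_ptr(uintptr_t w)
{
    return (void *)w;
}
```

In XS code, perl's own PTR2IV and INT2PTR macros are intended for the same job.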

    Cheers,
    Rob
      For the record, I finally managed to build a static 64-bit library, too.

      Well done. And thank you for trying. Was that with Mingw or MSVC?

      I never managed to get beyond the "Too early to specify a build action 'installdeps'. Do 'Build installdeps' instead." idiocy using MSVC.

      However, the Judy module that I built against the 64-bit library loaded ok, but kept crashing during the running of its test suite. In Judy.xs, I changed the hard coded "long" casts to "long long" casts. That got rid of lots of warnings during the build, but the crashing persisted. The generated Judy.c file still contained some casts to "unsigned long int" , but I couldn't immediately find what was generating those casts - at which point I totally lost interest.

      It's very capable code, but the very epitome of 'macro abuse'. The only way to work out what code is actually being compiled is to use /E and wade through the post processed source, but by then, you've no idea where the code came from; or what combination of #defines, #ifdefs and #ifndefs caused it.

      Like trying to listen to music by looking at an oscilloscope trace.

      That's why when I came across the 1 file version I linked, I jumped at it. It at least gave me a way to get a quick feel for the performance and memory usage -- both of which are very impressive -- until you have to make the Perl->XS->C and back transitions for every lookup :(

      It's not just the extra layers of function calls -- which I've tried to negate by forcing the functions to be inlined -- it also comes down to the fact that Perl code messes up the code and data caches as it leaps about all over memory chasing opcode trees. As the basis of Judy Array performance is the minimisation of cache misses, screwing with the cache between each access to those arrays pretty much undoes everything the author worked so hard to achieve.

      Also, I think that my need to use two Judy arrays in parallel does nothing to enhance the performance.


        Was that with Mingw or MSVC?

        That was a "./Configure" build in the msys shell using MinGW (Strawberry Perl's gcc-4.8.2).

        Like trying to listen to music by looking at an oscilloscope trace

        Nice analogy :-)
        Thanks for elaborating, also.

        Cheers,
        Rob
