go ahead... be a heretic | |
PerlMonks |
Re^4: Powerset short-circuit optimizationby Limbic~Region (Chancellor) |
on Oct 26, 2006 at 13:28 UTC ( [id://580758]=note: print w/replies, xml ) | Need Help?? |
jimt,
I am sad to tell you that despite providing an iterative version that need not be called more than necessary, it is terribly slow. I timed (not Benchmarked) 4 versions and unfortunately this wasn't even a contender:
The code to generate the 3_477 line data file and the recursive java version can be found at How many words does it take?. The two recursive perl versions are below:
They both share the following code:
The 13 second version has subsets() as The 28 second version has subsets() as
I made minor modifications to your code to handle my dataset as well as produce comparable output:
Update 1: After your 3rd update, your code finished in a respectable 78 seconds. Unfortunately it is still producing about 40% duplicates. Additionally, it doesn't produce the correct output (missing missing 92_835 strings out of 508_062). For instance 'cdglnst' does not appear at all in your output. Update 2: After your 4th update, your code narrowly makes 3rd place with 26 seconds and correct output! I included the entire perl script I am using above to ensure we are comparing apples to apples. Admittedly, yours does scale much better with both speed and memory. Unfortunately, it still isn't quite up to the task I needed. I will have to put this in my back pocket for later though. Cheers - L~R
In Section
Seekers of Perl Wisdom
|
|