No such thing as a small change | |
PerlMonks |
comment on |
( [id://3333]=superdoc: print w/replies, xml ) | Need Help?? |
Please feel free to respond away with solutions... Short version: J script runs in ~5.7 sec using ~650 MB to produce exactly same output. Which makes it fastest of all solutions so far. (The caveat: "words" are supposed to be not too different in length, they are temporarily padded to longest during program run (i.e. not padded at all with current test sample), expect deterioration if, say, their lengths differ 2 or more orders of magnitude or so, especially if just a few are very long. I didn't check.) ###################### (Prose and previous results, can be skipped) I was shocked to notice this thread is almost a month old already. While I'm in no hurry and have been pursuing what follows at leisure and only rarely (kind of "for dessert"), it's better to publish this at long last, be it "final" optimized version or not (I'm sure it can be improved a lot), before the thread is dead cold and whoever participated have to make effort to read their own code because of time elapsed. As a reference frame, here are assortment of results of previous solutions, with my hardware: llil2grt.pl (see 11148713) (fastest "pure core-only Perl"), Windows:
Same code & data & PC, Linux: (I didn't investigate why such difference.)
I assume that Judy (see 11148585) is best in both speed and memory for non-parallel Perl solutions, with same caveat as at the very top: "words" are temporarily padded to fixed length e.g. here to 10 bytes.:
Being lazy bum, I didn't compile C++ solutions (nor do I code in C++), here is a copy-paste from 11148969, I assume it is the best result so far, among C++ and all others: (For my PC, I expect time to be worse.)
###################### (End of prose and previous results) Code below generates next message with RAM usage taken from Windows Task Manager (to be on par with how it was measured for Perl), while script pauses for a key (Ctrl-D or Ctrl-Z + Enter combo as usual for Linux or Windows, respectively) after finish:
The injection of CR into output lines is only required on Windows (actually, not required at all) to later ensure no difference with output from Perl. The "magic constant" 3 for number width can be any, and is only used for intermediate step. I had to make this code slightly less readable than it was during development, by somewhat aggressively re-using over and over again same variable names for words and nums, as data are processed and modified as script progresses. They were different "self-explanatory" names at each stage, but because arrays are huge, it's better to immediately over-write variable on successive assignments to conserve memory. "Erasing" throw-away helper arrays (similar to undef in Perl) serves same purpose. Actually, during development I was playing with this toy dataset, here's original data and result: text =: noun define tango 1 charlie 2 bravo 1 foxtrot 4 alpha 3 foxtrot 1 bravo 1 foxtrot 7 ) NB. Do work here... ] text foxtrot 12 alpha 3 bravo 2 charlie 2 tango 1 The script:
In reply to Re: Rosetta Code: Long List is Long (faster)
by Anonymous Monk
|
|