
The good news is... The bad news is...

The good news is that I bring only good news! :) The modified J script is faster, more versatile, uses significantly less RAM, and has been tested with the 9.04 engine to parallelize the obvious low-hanging fruit for an additional speed boost.

NB. -----------------------------------------------------------
NB. --- This file is "llil4.ijs"
NB. --- Run as e.g.:
NB.
NB. jconsole.exe llil4.ijs big1.txt big2.txt big3.txt out.txt
NB.
NB. --- (NOTE: last arg is output filename, file is overwritten)
NB. -----------------------------------------------------------

pattern =: 0 1

NB. ========> This line has a star in its right margin =======> NB. *

args   =: 2 }. ARGV
fn_out =: {: args
fn_in  =: }: args

NB. PAD_CHAR =: ' '

filter_CR       =: #~ ~: & CR
make_more_space =: ' ' I. @ ((LF = ]) +. (TAB = ])) } ]
find_spaces     =: I. @: = & ' '

read_file =: {{
  'fname pattern' =. y

  text =. make_more_space filter_CR fread fname
  selectors =. (|.!.0 , {:) >: find_spaces text

  width  =. # pattern
  height =. width <. @ %~ # selectors

  append_diffs =. }: , 2& (-~/\)
  shuffle_dims =. (1 0 3 & |:) @ ((2, height, width, 1) & $)

  selectors =. append_diffs selectors
  selectors =. shuffle_dims selectors

  literal  =. < @: (}:"1) @: (];. 0)        & text "_1
  numeric  =. < @: (0&".) @: (; @: (<;. 0)) & text "_1
  extract  =. pattern & {
  using    =. 1 & \
  or_maybe =. `

  ,(extract literal or_maybe numeric) using selectors
}}

read_many_files =: {{
  'fnames pattern' =. y

  ,&.>/"2 (-#pattern) ]\ ,(read_file @:(; &pattern)) "0 fnames  NB. *
}}

'words nums' =: read_many_files fn_in ; pattern

t1 =: (6!:1) ''         NB. time since engine start

'words nums' =: (~. words) ; words +//. nums                    NB. *
'words nums' =: (\: nums)& { &.:>"_1 words ; nums
words =: ; nums < @ /:~/. words

t2 =: (6!:1) ''         NB. time since engine start

text =: , words ,. TAB ,. (": ,. nums) ,. LF
erase 'words' ; 'nums'
text =: (#~ ~: & ' ') text
text fwrite fn_out
erase < 'text'

t3 =: (6!:1) ''         NB. time since engine start

echo 'Read and parse input:    ' , ": t1
echo 'Classify, sum, sort:     ' , ": t2 - t1
echo 'Format and write output: ' , ": t3 - t2
echo 'Total time:              ' , ": t3
echo ''
echo 'Finished. Waiting for a key...'
stdin ''
exit 0
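For anyone not fluent in J, here is a toy run of the three "classify, sum, sort" lines from the script. The data is made up for illustration (a 4-row table of fixed-width words and one count per word); the real script gets both from the input files. This is only a sketch of what those lines do on small input, not a benchmark.

```j
NB. Hypothetical toy data: fixed-width words, one count per word
words =: 4 3 $ 'dogcatdogant'
nums  =: 1 5 2 3

NB. classify and sum: unique words, and the total per word
'words nums' =: (~. words) ; words +//. nums
NB. words is now dog,cat,ant and nums is 3 5 3

NB. reorder both by descending count
'words nums' =: (\: nums)& { &.:>"_1 words ; nums
NB. words is now cat,dog,ant and nums is 5 3 3

NB. within each group of equal counts, sort the words ascending
words =: ; nums < @ /:~/. words
NB. words is now cat,ant,dog
```

The last line relies on equal counts already being adjacent after the previous sort, so the groups come out in descending-count order.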

The script above doesn't (yet) use any 9.04 features and runs fine with 9.03, but I found 9.04 slightly faster in general. I also found 9.04 a bit faster on Windows, which is the opposite of what I saw with 9.03 (where the script ran faster on Linux). Let's shrug that off as a consequence of 9.04's beta status and/or my antique PC. The results below are for the 9.04 beta on Windows 10 (RAM usage taken from Windows Task Manager):

> jconsole.exe llil4.ijs big1.txt big2.txt big3.txt out.txt
Read and parse input:    1.501
Classify, sum, sort:     2.09
Format and write output: 1.318
Total time:              4.909

Finished. Waiting for a key...

Peak working set (memory): 376,456K

There are 3 star-marked lines. To enable parallelization with the new 9.04 features, replace them with these counterparts:

{{ for. i. 3 do. 0 T. 0 end. }} ''
  ,&.>/"2 (-#pattern) ]\ ,;(read_file @:(; &pattern)) t.'' "0 fnames
'words nums' =: (~.t.'' words) , words +//. t.'' nums

As you can see, the 1st line replaces a comment, while the 2nd and 3rd lines need only minor touches. The 2nd line launches reading and parsing of the input files in parallel. The 3rd line parallelizes filtering for unique words and summing the numbers according to the word classification. As I see it, that was somewhat redundant double work even before the patch. The 1st line starts 3 additional worker threads. My CPU has no more cores anyway, and this script has no work that is easily dispatched to more workers. Then:

Read and parse input:    0.992
Classify, sum, sort:     1.849
Format and write output: 1.319
Total time:              4.16
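To make the patched 3rd line less magical, here is the same idiom on made-up toy data. It needs J 9.04 or later, since T. and t. are not available in 9.03. The names pyxes, uwords and sums are mine, chosen for illustration. As I understand it, each t.'' call hands its verb to a worker thread and returns at once with a box (a "pyx") that blocks only when it is opened.

```j
NB. Requires J 9.04 or later
{{ for. i. 3 do. 0 T. 0 end. }} ''   NB. start 3 worker threads, as in the patch

NB. Hypothetical toy data: fixed-width words, one count per word
words =: 4 3 $ 'dogcatdogant'
nums  =: 1 5 2 3

NB. both halves are dispatched as tasks; each result is a boxed pyx,
NB. which is why the patch joins them with , instead of ;
pyxes =: (~.t.'' words) , words +//. t.'' nums

NB. multiple assignment opens the boxes, waiting for the tasks to finish
'uwords sums' =: pyxes
NB. uwords is dog,cat,ant and sums is 3 5 3
```

On input this small the threads buy nothing, of course. The point is only to show why the patched line swaps ; for , between the two halves.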

I would call my parallelization attempt, however crude, a success. Next is the output for our second "official" dataset in this thread:

> jconsole.exe llil4.ijs long1.txt long2.txt long3.txt out.txt
Read and parse input:    1.329
Classify, sum, sort:     0.149
Format and write output: 0.009
Total time:              1.487

########################################################

These are my results for the latest C++ solution (compiled with g++), for comparison with my efforts:

$ ./llil2vec_11149482 big1.txt big2.txt big3.txt >vec.tmp
llil2vec start
get_properties      CPU time : 3.41497 secs
emplace set sort    CPU time : 1.04229 secs
write stdout        CPU time : 1.31578 secs
total               CPU time : 5.77311 secs
total        wall clock time : 5 secs

$ ./llil2vec_11149482 long1.txt long2.txt long3.txt >vec.tmp 
llil2vec start
get_properties      CPU time : 1.14889 secs
emplace set sort    CPU time : 0.057158 secs
write stdout        CPU time : 0.003307 secs
total               CPU time : 1.20943 secs
total        wall clock time : 2 secs

$ ./llil2vec_11149482 big1.txt big2.txt big3.txt >vec.tmp
llil2vec (fixed string length=6) start
get_properties      CPU time : 2.43187 secs
emplace set sort    CPU time : 0.853877 secs
write stdout        CPU time : 1.33636 secs
total               CPU time : 4.62217 secs
total        wall clock time : 5 secs

I noticed that the new C++ code, which is supposed to be faster, is actually slower (compared to llil2grt) on the "long" dataset, one of the two "official" datasets used in this thread.

In reply to Re^3: Rosetta Code: Long List is Long (faster - vec)(faster++, and now parallel) by Anonymous Monk
in thread Rosetta Code: Long List is Long by eyepopslikeamosquito
