I was shocked to notice this thread is almost a month old already ... it's better to publish this at long last, be it "final" optimized version or not (I'm sure it can be improved a lot), before the thread is dead cold and whoever participated have to make effort to read their own code because of time elapsed

Thanks for posting. I suggest you just take your time and enjoy it without worrying too much about how old the thread is. This is actually one of my favourite features of Perl Monks, so much so that I wrote a meditation about it: Necroposting Considered Beneficial. :)

Your reply motivated me into finally getting around to installing Linux on my newish Windows laptop. Being lazy, I did this with a single wsl --install command to install the default Ubuntu distribution of Linux from the Microsoft Store. AFAIK, the main alternative is to install VMware, followed by multiple different Linux distros.

Anyways, after doing that I saw broadly similar performance on Linux and Windows for both my fastest Perl and my fastest C++ versions.

For llil2grt.cpp on Windows 11:

    llil2grt start
    get_properties CPU time : 4.252 secs
    emplace set sort CPU time : 1.282 secs
    write stdout CPU time : 1.716 secs
    total CPU time : 7.254 secs
    total wall clock time : 7 secs

On Ubuntu WSL2 Linux (5.15.79.1-microsoft-standard-WSL2), running on the same hardware, compiled with the identical g++ -o llil2grt -std=c++11 -Wall -O3 llil2grt.cpp, we see it runs slightly faster:

    llil2grt start
    get_properties CPU time : 3.63153 secs
    emplace set sort CPU time : 1.09085 secs
    write stdout CPU time : 1.41164 secs
    total CPU time : 6.13412 secs
    total wall clock time : 6 secs

Sadly, it's becoming increasingly obvious that to make this run significantly faster, I'll probably have to change the simple and elegant:

hash_ret[word] -= count;
into something much uglier, possibly containing "emplace_hint" or other std::map claptrap ... and I just can't bring myself to do that. :) More attractive is to leave the simple and elegant one-liner alone and instead try to inject a faster custom std::map memory allocator (I have no idea how to do that yet).

Conversely, my fastest Perl solution llil2grt.pl runs slightly slower on Ubuntu:

    llil2grt start
    get_properties : 10 secs
    sort + output : 22 secs
    total : 32 secs
compared to 10, 20, 30 secs on Windows 11. Perl v5.34.0 on Linux vs Strawberry Perl v5.32.0 on Windows.

Update: this shortened version, llil2cmd-long.pl, runs a bit faster on Ubuntu (but not on Windows):

    llil2cmd-long start
    get_properties : 7 secs
    sort + output : 21 secs
    total : 28 secs

The injection of CR into output lines is only required on Windows (actually, not required at all)

Yes, you're right, Windows nowadays seems perfectly happy with text files terminated with \n rather than the traditional DOS \r\n. By default, Perl and C++ both output text files with "\n" on Unix and "\r\n" on Windows. I'll update my test file generator to generate identical "\n" terminated files on both Unix and Windows.

Update: Test file generators updated here: Re^3: Rosetta Code: Long List is Long (Test File Generators). Curiously, \n seems to be slower than \r\n on Windows if you don't set binmode! I am guessing that chomp is slower with \n than with \r\n on a Windows text stream.

Update: Are you running Strawberry Perl on Windows? Which version? (Trying to understand why your Windows Perl seems slower than mine).

Update: The processor and SSD disk (see Novabench top scoring disks) on my HP laptop:

    Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz, 1501 Mhz, 4 Core(s), 8 Logical Processor(s)
    Disk (238 GB SSD): Intel Optane+238GBSSD (ave sequential read 513 MB/s; ave sequential write 280 MB/s) - score 73

References Added Later

Updated: Noted that llil2grt.pl runs slightly slower on Ubuntu Linux than on Windows, along with detailed timings. Clarified that I'm running Windows 11. Added more detail on my laptop hardware. Mentioned llil2cmd-long.pl, developed later.


In reply to Re^2: Rosetta Code: Long List is Long (faster) by eyepopslikeamosquito
in thread Rosetta Code: Long List is Long by eyepopslikeamosquito
