
Re: Trying to Understand the Discouragement of Threads

by salva (Canon)
on Nov 18, 2014 at 10:44 UTC


in reply to Trying to Understand the Discouragement of Threads

Threads have a big issue: they cannot be encapsulated.

You cannot blindly create a thread in the middle of your program, because doing so duplicates all the data structures in the program's memory. If your script is using 1 GB of memory and you create a thread, it immediately goes to 2 GB; create another thread and it goes to 3 GB, and so on. Replicating those data structures can also be quite expensive in terms of CPU usage.
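A minimal sketch of that cloning behaviour, assuming a perl built with ithreads support. The @dataset array is a hypothetical stand-in for the big in-memory structures; the thread empties its own copy and the parent's copy is left untouched:

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for a large in-memory dataset.
my @dataset = (1 .. 100_000);

# threads->create clones the interpreter's data, @dataset included,
# before the new thread starts running.
my $thr = threads->create(sub {
    @dataset = ();             # empties only this thread's private copy
    return scalar @dataset;    # 0 inside the thread
});

my $count_in_thread = $thr->join;
my $count_in_parent = scalar @dataset;

# prints "thread sees 0 elements, parent still has 100000"
print "thread sees $count_in_thread elements, parent still has $count_in_parent\n";
```

The copy is made whether or not the thread ever touches the data, which is where the memory and CPU cost comes from.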

For instance, imagine you want to query several HTTP servers in parallel. Threads look like a good match for that, so you build a module using threads, test it, and it runs fine. But then, when you use it from some data processing script that holds big datasets in memory, everything goes nuts.

With the current thread support in Perl, you can only use threads at the top level, designing your whole program around them.
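One way to read "designing your program around them" is to start the worker threads first, while the interpreter is still small, and only then load the big data. The following is a sketch under that assumption, not a recipe from this thread; it needs a perl built with ithreads, and the names $jobs, $results and @big are made up for illustration:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $jobs    = Thread::Queue->new;
my $results = Thread::Queue->new;

# Start the workers early: at this point there is little data to clone.
my @workers = map {
    threads->create(sub {
        # An undef job is the signal to stop.
        while (defined(my $n = $jobs->dequeue)) {
            $results->enqueue($n * $n);
        }
    });
} 1 .. 2;

# Only now load the (hypothetical) big dataset; the running workers
# were cloned before it existed, so they hold no copy of it.
my @big = (1 .. 100_000);

$jobs->enqueue($_) for 1 .. 5;
$jobs->enqueue(undef) for @workers;    # one stop signal per worker
$_->join for @workers;

my @squares = sort { $a <=> $b } map { $results->dequeue } 1 .. 5;
print "@squares\n";    # prints "1 4 9 16 25"
```

The catch is the one described above: a module cannot make this choice on its own, because the decision about when threads are created has to be taken by the main program.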

Re^2: Trying to Understand the Discouragement of Threads
by BrowserUk (Patriarch) on Nov 18, 2014 at 11:32 UTC

    But:

    1. it doesn't stop them being very useful.
    2. it doesn't need to be that way.

    So, rather than throwing their hands up and saying, "We can't be bothered to work out how to use them, so you probably shouldn't."; they could ... say ... improve them.

    (It's actually quite trivial to start a new thread running a completely clean, empty interpreter ....)


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      (It's actually quite trivial to start a new thread running a completely clean, empty interpreter ....)

      How? Is there any module which does that?

      Actually, I would prefer something that would allow me to start an empty interpreter, run some initialization code inside it, and then clone it on demand and run arbitrary code on the clones (I toyed a couple of times in the past with the idea of doing something like that myself... but never got the tuits).

        Ostensibly, starting a new interpreter in a second thread is as simple as starting one in your main thread.

        And whatever you load into that interpreter will be completely separated and isolated from any other thread+interpreter you happen to have running -- so long as you stay away from process-global entities such as the environment, filehandles, etc. And that's where things start to get complicated.

        You can even get creative and switch the interpreter contexts around, so that you effectively break the 1 thread tied to 1 interpreter link.

        Running isolated interpreters is relatively trivial; the problems arise when it comes to communicating between them and sharing memory. And that's where the iThreads coders went "wrong"(*). They opted for a 'clone everything' implementation which, whilst meeting their brief -- to provide a fork emulation on Windows -- creates all (literally all) of the perceived problems that have been (incorrectly) labeled as the "heaviness of threads". The heaviness lies entirely with the attempt to provide Copy-On-Write semantics without the use of Copy-On-Write OS support. It means you have to copy everything up front just in case it gets written to. (Which is all the more frustrating as Windows does actually support Copy-On-Write memory!)

        The second big problem with iThreads, namely the size cost of shared memory, is equally frustrating, because it is equally fixable!

        For example, when you create a shared array, a (bog standard) array is allocated in 'shared space', and then each thread that accesses it gets a tied array that 'redirects' reads and writes to the shared array. But the dumb bit is that the tied array in user threads consists of a tied (standard) Perl array of tied scalars. Tied scalars (with their associated magic) are bigger than most ordinary scalars, which means that the 'placeholder' tied array is often much bigger than the actual shared array! Which is a nonsense.

        The entire tied array could consist of a single blessed scalar containing a reference to the shared array. Full stop. When a FETCH or STORE (or any other tied array method) is invoked, the arguments provide all the required information to allow the shared array to be accessed and/or modified. Imagine how much lighter (and faster) shared arrays would be if each thread only required a single shared scalar placeholder, instead of a full array of (fat) shared scalars!
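        To make the shape of that idea concrete, here is a plain-Perl, single-threaded sketch of a tied array whose object is one blessed scalar holding a reference to the real array. The ProxyArray class name is hypothetical, and this is not how threads::shared is implemented; it only illustrates that FETCH and STORE receive everything they need to redirect to the target:

        ```perl
        use strict;
        use warnings;

        package ProxyArray;

        # The tied object is a single blessed scalar that holds a
        # reference to the real array; no per-element placeholders.
        sub TIEARRAY  { my ($class, $target) = @_; return bless \$target, $class }
        sub FETCH     { my ($self, $i) = @_;       return $$self->[$i] }
        sub STORE     { my ($self, $i, $v) = @_;   $$self->[$i] = $v }
        sub FETCHSIZE { my ($self) = @_;           return scalar @{$$self} }
        sub STORESIZE { my ($self, $n) = @_;       $#{$$self} = $n - 1 }

        package main;

        my @real = (10, 20, 30);
        tie my @proxy, 'ProxyArray', \@real;

        $proxy[1] = 99;    # STORE redirects the write to @real

        # prints "99 10 3"
        print join(' ', $real[1], $proxy[0], scalar(@proxy)), "\n";
        ```

        A real shared-array version would also need locking and the remaining tied-array methods (PUSH, POP, and so on); those are left out here to keep the sketch short.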

        Likewise for hashes, for %ENV, for filehandles, and for all other process-global entities.



        I haven't used it, but I think threads::lite does just that: it spawns a mostly separate interpreter with a fresh (OS) thread. It also sets up some communication channels between the main thread and the new thread, but it shares nothing.

        Judging from the caveats in the documentation, loaded modules stay loaded, but all local namespaces with imports seem to be wiped. So it's not a completely clean interpreter, but it is roughly equivalent.
