PerlMonks  

Re^12: threading a perl script

by vkon (Curate)
on Apr 25, 2011 at 19:19 UTC


in reply to Re^11: threading a perl script
in thread threading a perl script

Yes, you're correct -
I was assuming that 1) threads.xs and 2) fork both use the same underlying engine: ithreads.
I did not even bother to look into the implementation details, as that level of detail was enough for my particular purposes.

Now you're stating that 1) my misunderstanding is manifest and 2) the devil is in the details.

Now this boils down to one of two things:

  • either, when looking deep into the implementation details, I will see that 'fork' and 'threads.xs' are really different beasts, because the devil is in the details
  • or, looking into the API described at your link, I will eventually understand the deep differences between 'threads.xs' and 'fork'
So what should break my wrong assumption?
I still fail to see the point of my misunderstanding.

Actually I am inclined to think that my phrases were not clear, that they were too vague because of my imperfect English, and so I do not have a deep misunderstanding.

I.e. I do not know the exact implementation details, but I do know that both use 'ithreads'.

Re^13: threading a perl script
by BrowserUk (Patriarch) on Apr 25, 2011 at 20:22 UTC

    Anonymonk pointed out that:

    you're conflating unrelated things, fork is not implemented using threads module

    That was in reply to a post where you used a bug in Perl's Windows fork emulation as a justification for not using threads, nor even building your *nix perl with threading enabled.

    Your misunderstanding is to treat every bug in the former (the fork emulation) as a bug in the latter (threads) as well. Most frequently they are not. Here is a simplified explanation of why.

    Anyone that does not use the fork emulation, either because--like me--they do not see any advantage to it; or because--like you--they use an OS that doesn't need fork to be emulated, will not be affected by bugs in that emulation, even though they might use threads regularly.

    Although both are based around the concept of running a separate interpreter in a new thread--ithreads--they are implemented in quite different ways (with some common code), to achieve quite different things.

    When the code reference passed to threads->create() starts running, it is at the very root of the call-stack in its interpreter. That is, it inherits access to pre-existing code and data, but its call-stack, and all its other stacks, are empty. There are no pending returns to be returned from. No complex of unclosed scopes to be unwound. No half executed if statement that must be conditionally branched.

    Just a coderef that must be executed in the context of some preloaded code and data. Pretty much exactly the same as when main::main is called in a new, single threaded program. That is, all the use statements have been processed, all the BEGIN/CHECK/INIT blocks have been run; some namespaces, coderefs and data have been preloaded, and it is time to run the program.

    The only real differences between the 'new thread' case and the 'new process' case are: a) the entrypoint isn't called main::main; b) the namespaces, coderefs and data didn't need to be loaded from disk, source code parsed and tokenised, opcode trees built etc. A quick block copy and it is ready to go.
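
A minimal sketch of the 'new thread' case described above, assuming a perl built with ithreads. The names `%config`, `worker` and `deeply_nested_caller` are hypothetical, chosen only for illustration. The thread is created from inside a nested call, yet it begins at the top of `worker()` with its own private copy of the preloaded data:

```perl
use strict;
use warnings;
use threads;

# Preloaded data: cloned into the new interpreter when the thread is created.
my %config = (workers => 4);

# Entry point of the new thread. It starts here, not where the parent was.
sub worker {
    my ($n) = @_;
    $config{workers} += $n;    # changes this thread's private copy only
    return $config{workers};
}

# The parent is inside a nested call when it spawns the thread;
# none of that pending call state is carried over into the thread.
sub deeply_nested_caller {
    return threads->create({ context => 'scalar' }, \&worker, 10);
}

my $thr            = deeply_nested_caller();
my $seen_in_thread = $thr->join;

print "thread saw:  $seen_in_thread\n";    # 14
print "parent sees: $config{workers}\n";   # 4
```

The parent's copy is untouched because unshared variables are copied per thread; sharing would have to be requested explicitly (for example with threads::shared).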

    Contrast that with the fork emulation case where the program is already in mid-flow. The program is usually in the middle of an if statement; often embedded within a half executed loop; usually several layers, and potentially dozens of layers of scope down. Potentially in a recursive call chain. Perhaps within a BEGIN or END block. Maybe embedded within several layers of to-be-unwound exception handling. Perhaps with unhandled pending signals, timers or whatever.

    In order for the fork to work, all of this state--half executed opcode stack, code stack, save stack, temp stack, curstack and mark stack--has to be captured and faithfully reproduced, in addition to the preloaded state that a new thread needs. And the scope [sic] for getting that lot wrong is far greater than simply copying the preloaded state required by a new thread.
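
For contrast, a small sketch of the 'mid-flow' fork case. The sub names `inner` and `outer` are hypothetical. Each fork happens inside a nested call within a half executed loop, and the child resumes from exactly that point with the loop index and pending returns intact. On *nix the OS copies that state; under the Windows emulation perl has to reproduce it itself:

```perl
use strict;
use warnings;

sub inner {
    my ($i) = @_;
    my $pid = fork();                 # both parent and child return from here, mid-loop
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: it sees the same $i and the same pending return to outer().
        exit($i);                     # report the inherited loop index as exit status
    }
    waitpid($pid, 0);
    return $? >> 8;                   # parent: collect the child's exit status
}

sub outer {
    my @seen;
    for my $i (1 .. 3) {              # the fork happens inside this loop
        push @seen, inner($i);
    }
    return @seen;
}

my @seen = outer();
print "children saw loop indices: @seen\n";   # 1 2 3
```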


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      ok, somewhere deep inside it turns out that "fork" has more to clone on Windows, threads.xs is 5% simpler to initialize, and there is also some RTFS chunk of code that makes a great difference between them.

      good.

      Still, they have one huge thing in common,

      I think this common thing outweighs any implementation differences.

      Actually this unfortunate design has its historical explanations - there were 5005THREADS, which were developed in pre-5.6.0 perl and were designed to be lightweight.
      this stuck, for some reason, and 'ithreads' were added quickly as an easier solution

      And this means perl does not have efficient threads.
      You could still use them to spread load across multiple CPUs, but perl is just not the right tool for the task; it is not efficient when we're talking about threads.

      Do you know how this threading is done in Python and Ruby, BTW?
      (I do not know, but maybe you do?)

      PS: I use win32 perl more often than the Linux one, although I use both quite often.

        threads.xs is 5% simpler to initialize

        Man. Are you incapable of rational thought? (Scan forward to 12.25 seconds and watch for 4 minutes.)

        for sure, you already saw this famous article

        Yes. And I know that Liz didn't know what she was talking about.

        (Take a close look at the crap modules she littered the Thread::* namespace on CPAN with if you doubt this.)

        And neither do you. Your misunderstanding (and hers) is manifest.

        Actually this unfortunate design has its historical explanations - there were 5005THREADS, which were developed in pre-5.6.0 perl and were designed to be lightweight. this stuck, for some reason, and 'ithreads' were added quickly as an easier solution.

        And that is the final nail of proof in your 'threading expertise' coffin. You simply haven't taken the time, (nor any time it seems), to attempt to understand the subject on which you are pontificating.

        5005 threads were kicked into touch because it was impossible to make Perl's 'fat' data structures safe. Not just from user abuse, but even internally.

        Ithreads are not an 'evolution' of 5005 threads.

        Do you know how this threading is done in Python and Ruby, BTW?

        Yes. Both implement forms of 'green threads', which is to say, user space threading. In other words, they run as a single OS thread, and implement a (crude) internal scheduler within that single OS thread.

        Which means they rely upon a global interpreter lock. Which means they are very slow. Not just for access to shared state, but all state.

        And they do not scale.

        Because they use user space threading, to the OS they run under, they are a single threaded process. Which means they can only ever utilise one core at a time. No matter how many cores are available, or how many (pseudo)threads the program spawns, they are only ever going to execute one instruction at a time.

        Bottom line, in the words of Python users: "Python threads suck!" Read the entire thread. Look for the acronym "GIL". Try and understand.

        In the words of Ruby users, "Threads suck in Ruby. They are not native threads but are programatic threads in the ruby engine. If you have any thread that tries to open a socket, and the socket destination is not there, it will take a while for the socket to time out. During this wait, the whole ruby engine is frozen."

        Perl 5 is (to my knowledge) the only dynamic, interpreted language that supports true, concurrent, scalable (kernel-based) threading. Imperfect for sure, but far less imperfect than other dynamic languages. And far simpler and more productive to use than any compiled language.
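
As a rough illustration of using Perl threads for CPU-bound work, here is a sketch that splits a sum across several threads, assuming a perl built with ithreads. The worker count and the `partial_sum` helper are hypothetical, and whether the threads actually land on separate cores is up to the OS scheduler:

```perl
use strict;
use warnings;
use threads;

# Sum the integers in [$from, $to]; pure CPU work with no shared state.
sub partial_sum {
    my ($from, $to) = @_;
    my $sum = 0;
    $sum += $_ for $from .. $to;
    return $sum;
}

my $n       = 1_000_000;
my $workers = 4;                      # hypothetical worker count
my $chunk   = $n / $workers;

# One thread per chunk; each returns a single scalar result.
my @threads = map {
    my $from = ($_ - 1) * $chunk + 1;
    my $to   = $_ * $chunk;
    threads->create({ context => 'scalar' }, \&partial_sum, $from, $to);
} 1 .. $workers;

# join() blocks until the thread finishes and hands back its return value.
my $total = 0;
$total += $_->join for @threads;

print "total: $total\n";              # 500000500000
```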

        This thread is more than deep enough. If you want to continue to display your ignorance of this subject, you can do so as a monologue.


