Re^14: threading a perl script

by vkon (Curate)
on Apr 26, 2011 at 06:40 UTC


in reply to Re^13: threading a perl script
in thread threading a perl script

OK, somewhere deep inside, it reveals that "fork" has more to clone on Windows, that threads.xs is 5% simpler to initialize, and that there is some RTFS chunk of code that makes a great difference between them.

Good.

Still, they have one huge thing in common,

I think this common ground matters more than some implementation differences.

Actually, this unfortunate design has its historical explanation: there were 5005THREADS, which were developed in pre-5.6.0 perl and were designed to be lightweight.
This stuck, for some reason, and 'ithreads' were added quickly as an easier solution.

And this means Perl does not have efficient threads.
You could still use them to spread load across multiple CPUs, but Perl is just not the right tool for the task; it is not efficient when we're talking about threads.

Do you know how this threading is done in Python and Ruby, BTW?
(I do not know, but maybe you do?)

PS: I use Win32 perl more often than the Linux one, although I use both quite often.

Re^15: threading a perl script
by BrowserUk (Patriarch) on Apr 26, 2011 at 07:41 UTC
    threads.xs is 5% simpler to initialize

    Man. Are you incapable of rational thought? (Scan forward to 12.25 seconds and watch for 4 minutes.)

    for sure, you already saw this famous article

    Yes. And I know that Liz didn't know what she was talking about.

    (Take a close look at the crap modules she littered the Thread::* namespace on cpan with if you doubt this. )

    And neither do you. Your misunderstanding (and hers) is manifest.

    Actually, this unfortunate design has its historical explanation: there were 5005THREADS, which were developed in pre-5.6.0 perl and were designed to be lightweight. This stuck, for some reason, and 'ithreads' were added quickly as an easier solution.

    And that is the final nail of proof in your 'threading expertise' coffin. You simply haven't taken the time, (nor any time it seems), to attempt to understand the subject on which you are pontificating.

    5005 threads were kicked into touch because it was impossible to make Perl's 'fat' data structures safe. Not just from user abuse, but even internally.

    Ithreads are not an 'evolution' of 5005 threads.

    Do you know how this threading is done in Python and Ruby, BTW?

    Yes. Both implement forms of 'green threads', which is to say, user space threading. In other words, they run as a single OS thread, and implement a (crude) internal scheduler within that single OS thread.

    Which means they rely upon a global interpreter lock. Which means they are very slow. Not just for access to shared state, but all state.

    And they do not scale.

    Because they use user space threading, to the OS they run under, they are a single threaded process. Which means they can only ever utilise one core at a time. No matter how many cores are available, or how many (pseudo)threads the program spawns, they are only ever going to execute one instruction at a time.

    Bottom line, in the words of Python users: "Python threads suck!" Read the entire thread. Look for the acronym "GIL". Try and understand.

    In the words of Ruby users, "Threads suck in Ruby. They are not native threads but are programatic threads in the ruby engine. If you have any thread that tries to open a socket, and the socket destination is not there, it will take a while for the socket to time out. During this wait, the whole ruby engine is frozen."

    Perl 5 is (to my knowledge) the only dynamic, interpreted language that supports true, concurrent, scalable (kernel-based) threading. Imperfect for sure, but far less imperfect than other dynamic languages. And far simpler and more productive to use than any compiled language.

    This thread is more than deep enough. If you want to continue to display your ignorance of this subject, you can do so as a monologue.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      At least for Python, your assertion is partially incorrect. The GlobalInterpreterLock "prevents multiple native threads from executing Python bytecodes at once", but (other than that), Python threads are native OS threads. You don't get as much benefit from these threads while running inside of Python, but at least C extensions (and it seems also blocking system calls) can release the GIL to other code.

        You're right. It is much worse than that.

        It spawns native threads, but then uses them like green threads. Except that the internal 'scheduler' is so poor that it has to run in perpetual debug mode. Thus totally preventing any form of usable concurrency.

        It's like saying: Of course everyone can have the privilege of owning their own car, sitting in their own private space, listening to their own choice of music. But to go anywhere, you have to wait for the car-train to arrive, join the queue and follow it wherever it goes until (if you are lucky) it goes past your exact destination. At which point you can leave.

        It has all the costs with none of the benefits.

        I guess it does serve to highlight how difficult it is to implement kernel threading in a dynamic language; and just how f***ing amazing iThreads really are. Imperfect for sure. But still amazing, given that they were retro-fitted without breaking backward compatibility and are so usable and scalable.


      Actually I am aware of occupying your precious time by explaining to me things. (no sarcasm, actually)
      These are useful, thanks.

      I never said that 'ithreads' was an evolution of 5005threads.

      I was surprised to read this: "And I know that Liz didn't know what she was talking about."

      Well, I learned something new and useful today,
      :) :)

      I never stated that I know many details about threads, but I thought I understood some basics and drawbacks.
      OK, maybe my understanding needs to be adjusted, in the sense that threads are more usable than I initially thought.

      Now I wonder - are Tcl threads any better?
      I will look at http://www.tcl.tk/doc/howto/thread_model.html and will let you know.
      Actually, some things in Tcl are done better than in Perl (starkits, GUI and virtual file systems), but, for me at least, Perl is much better to use.

      OK, now I am off to read the articles about Python and Ruby threading that you pointed out,
      for which I am especially grateful.

        Actually I am aware of occupying your precious time by explaining to me things

        My time isn't particularly precious these days, and I'm happy to talk about threading. But like Stephen Fry, I do get rattled a little by someone who just keeps repeating the same thing over and over. ("Ceiling!"; "They're heavy!")

        iThreads are "heavy" compared to kernel threads in low-level languages like C. This is undeniable. But that isn't a reason to either reject them or decry them. Trucks are heavy compared to cars, but that's because they need to be to perform their function. So it is with iThreads.

        And so it is with integers in Perl. They are heavy compared to integers in low-level languages like C.

        On my 64-bit system: In C, a million (native) integers occupy 8MB. In Perl, those same million integers occupy 32MB.

        But we forgive Perl that weight because of all the value-add that weight gives us. Numbers that transparently convert themselves from text to binary, integer to floating point, and back again as the program demands.

        No mess of text to binary conversion routines: _tstof(), atof(), _wtof(), _tstoi(),  atoi(), _wtoi(), _tstoi64(), _atoi64(), _wtoi64(), _tstol(), atol(), _wtol(), _ttoi(), _wtoi(), _ttoi64(), _atoi64(), _wtoi64(), _ttol(), atol(), _wtol() and corresponding mess of binary to text conversion routines: _itoa(), _i64toa(), _ui64toa(), _itow(), _i64tow(), _ui64tow(), _ultoa(), _ultow(), _ultoa_s(), _ultow_s().

        We recognise the costs of Perl's dynamic nature, and recognise that they are outweighed by the benefits it brings to the simplicity of its programs and productivity of its programmers.

        Similarly, the 'weight' of iThreads brings huge benefits.

        • Sheer simplicity.

          For many, many uses, a single keyword, async(), is all you need to get access to true, scalable concurrency.

        • Safety.

          Need to do two things concurrently? Throw one of them into an async{ ... }; block and you've got it. Done and dusted.

          No fears about accidental sharing. (Mostly) no need to lock anything. No deadlocks, livelocks or priority inversions to think about. It just works.

        • Usability.

          For the common, simple cases, adding concurrency to your single threaded code is almost trivial.

        • Perl has them.

          Do not underestimate how unique a dynamic language with usable, scalable, workable threading is.
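
        To make the "throw one of them into an async block" case concrete, here is a minimal sketch. It assumes a perl built with ithreads support ("use threads" dies otherwise), and sum_to() is a made-up workload for illustration, not anything from this thread.

```perl
use strict;
use warnings;
use threads;    # assumes a perl built with ithreads support

# Hypothetical workload: sum the integers 1..$n.
sub sum_to {
    my ($n) = @_;
    my $total = 0;
    $total += $_ for 1 .. $n;
    return $total;
}

# Start one computation in a new thread. The scalar assignment means
# the thread's return value is collected in scalar context.
my $thr = async { sum_to(1_000_000) };

# Meanwhile the main thread does the other piece of work.
my $main_result = sum_to(2_000_000);

# join() waits for the thread to finish and returns the block's value.
my $thread_result = $thr->join;

print "thread: $thread_result, main: $main_result\n";
```

        Nothing is shared between the two computations, so there is nothing to lock; the only synchronisation point is the join().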

        As implemented, iThreads are not perfect by any stretch of the imagination. There are many ways in which that implementation could be improved. But for dynamic languages, the iThreads model is the only way to go, and will be much copied in the future.

        Indeed, I believe that when the dust settles, compiled languages will provide threading that offers a similar model, i.e. explicit-only shared state. The benefits of shared state concurrency, but only when you want it. It is almost common sense really. Sharing everything is just so dangerous.

        Human beings are very bad at remembering what they've used and where. But compilers are single-minded and relentless at tracking such details. So, let the compiler yell at us and stop working if we start accessing the same data from multiple threads. Have it force us to mark shared data as such and so force us to think about the requirements and consequences of doing so. Then, anything not explicitly marked is safe, requires no locking, and runs at full speed.
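
        For comparison, this is roughly what explicit-only sharing looks like in Perl's iThreads today. It is a sketch, assuming a perl built with ithreads support; the variable names and the worker count of 4 are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use threads;            # assumes a perl built with ithreads support
use threads::shared;

my $private = 0;            # unmarked: every thread gets its own copy
my $shared :shared = 0;     # explicitly marked: one copy seen by all threads

my @workers = map {
    threads->create(sub {
        for (1 .. 1000) {
            $private++;         # changes only this thread's private copy
            lock($shared);      # held until the end of this loop iteration
            $shared++;
        }
    });
} 1 .. 4;

$_->join for @workers;

# The main thread's $private was never touched by the workers;
# $shared received all 4 * 1000 increments.
print "private=$private shared=$shared\n";    # private=0 shared=4000
```

        Only the variable marked :shared needs a lock; everything else is thread-local by default, which is the property the paragraph above is arguing for.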


