Missing (i)threads Features?

by liz (Monsignor)
on Aug 20, 2003 at 10:13 UTC ( [id://285148]=perlmeditation )

The other day, BrowserUk mentioned that he felt a few features were still missing from the Perl ithreads implementation. Fortunately, I was able to quickly provide a solution for the feature he was missing (check out Thread::Running if you're interested).

So far I've recorded these missing features:

Being able to set the priority of a Perl thread.
On Win32 this would involve SetThreadPriority and friends; on *nix this would involve nice(). The goal is to come up with a transparent interface that works on both types of systems and still allows access to all features.
Find out whether yield() is a no-op or not
This would allow you to warn the user when a threaded application that uses yield() is likely to burn a lot of CPU.
Suspend/Resume a thread
It can be useful to be able to create threads and suspend them pending some event. Without knowledge of how threads are implemented on other platforms I can't say much more on this. (suggested by BrowserUK).
Debugging threads
Use the debugger just as you can without threads (suggested by BrowserUK).
die(), warn() and Carp.pm include thread information
Currently, you don't know from which thread a message originated (suggested by BrowserUK)

I was wondering whether other people miss features in Perl threads as well. If you do, please let me know. Maybe something already exists on CPAN that does the trick. Or maybe it is simple to implement and abstractify (is that a word?) into a module.

Any valid suggestions I will add to the above list.

Liz

Replies are listed 'Best First'.
Re: Missing (i)threads Features?
by BrowserUk (Patriarch) on Aug 20, 2003 at 16:06 UTC

    A few thoughts so far

    1. Set/GetThreadPriority / Nice.

      Nice appears to have a range of possible values from +20 to -20, with positive values lowering priority and negative values raising it. This is true for processes. How are threads implemented under *nix? Can their priorities be manipulated independently of their process?

      Win32 uses a range of 0 to +31, but this is abstracted into a number of classes: IDLE, BELOW_NORMAL, NORMAL, ABOVE_NORMAL, HIGH and REALTIME (not all are available on all platforms). There is also a scheme for dynamically boosting a thread's priority for short periods.

      I've no idea what equivalents are available under VMS/MAC etc.

      For the most part, most OSs are probably best left to manage such matters for themselves, but I can see the desirability of designating a thread to run only when nothing else needs the cpu and also for marking a thread to run immediately when an asynchronous event unblocks.

      I think that abstracting this into a set of 3 or 4 'levels' makes sense. It would take some expertise on each of the different OSs to translate the levels into appropriate numerical values for each platform.

    2. Yield.

      Rather than having a facility to detect whether yield actually yields, it would be better if yield were implemented to do something sensible--like sleep for short periods--on those platforms where the OS doesn't provide the facility natively. This is probably a simplistic view, but there ought to be some approximation possible on most platforms?

    3. Suspend/Resume.

      It can be useful to be able to create threads and suspend them pending some event. Without knowledge of how threads are implemented on other platforms I can't say much more on this.

    4. Thread::Running is nice, but one of the main uses I had for this is still non-trivial to code.

      The situation is the 'pool of threads' scenario where you want to have the main thread block until one of the current threads has finished. Effectively, I find myself wanting to code

      if( @threads > LIMIT ) { threads->join_any(); } else { push @threads, threads->new( \&childStuff, @args ); }

      This is the threaded equivalent of wait (as opposed to waitpid) in the forking world.

      A minor critique of your current implementation of Thread::Running etc. is that I have to supply a list of threads to be checked. The 'system' already knows what threads there are, and it shouldn't be necessary for me to supply a list to running(), tojoin(), exited(). It becomes a pain to remember to remove threads that you have joined from the list before passing it back in. It requires me to duplicate the hash that you are using internally.

    5. Debugger support for threads needs improvement.

      One of the biggest advantages of multi-threading over multi-processing is that (with appropriate debugger support) it is 'easier' (for some meaning of that term) to debug if all the threads of execution are in a single process and manageable from a single debugger session.

      It should be possible to single step, set breaks and watches, view source around the current point on a thread by thread basis. The debugger prompt should indicate which thread is executing each line when tracing etc.

      This is one place where the ability to suspend/resume threads can be useful.

    6. die, warn and Carp should all include the thread id in their output by default in threaded applications.

    Most of this stuff would be better, more easily and more efficiently done in threads.xs rather than requiring several extra modules each with its own state which (often) duplicates state already existing at the system level. Perhaps the powers that be would accept a threads::util module for this sort of stuff.

    I'm acutely aware that I am (legitimately) open to the criticism "Okay. Where are the patches?". The only response I can give is "I'm working on it", but it is going slowly:(

    To date, I still haven't succeeded in building perl with a free compiler such that it isn't emasculated (no large file support or PerlIO for example).


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

      ... I think that abstracting this into a set of 3 or 4 'levels' makes sense...

      I agree.
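
As an illustration of the 'levels' idea, here is a minimal sketch of such a translation table. The level names, the function name native_priority() and the nice values are assumptions made up for this example. The Win32 numbers are the THREAD_PRIORITY_IDLE, THREAD_PRIORITY_BELOW_NORMAL, THREAD_PRIORITY_NORMAL and THREAD_PRIORITY_ABOVE_NORMAL values that SetThreadPriority() accepts. The sketch only performs the lookup; it does not call any native API.

```perl
use strict;
use warnings;

# Hypothetical abstract levels mapped to per-platform numbers.
# Win32 values are THREAD_PRIORITY_* constants; the nice values are
# illustrative choices only, not a recommendation.
my %PRIORITY_MAP = (
    MSWin32 => { idle => -15, low => -1, normal => 0, high =>   1 },
    unix    => { idle =>  19, low => 10, normal => 0, high => -10 },
);

# Translate an abstract level into the number the native API expects.
# The second argument defaults to the running platform ($^O).
sub native_priority {
    my ($level, $os) = @_;
    $os = $^O unless defined $os;
    my $table = $PRIORITY_MAP{ $os eq 'MSWin32' ? 'MSWin32' : 'unix' };
    die "Unknown priority level '$level'\n" unless exists $table->{$level};
    return $table->{$level};
}

print native_priority('idle', 'MSWin32'), "\n";   # -15
print native_priority('high', 'linux'),   "\n";   # -10
```

Other platforms (VMS, Mac and so on) would each need their own row in the table, which is where the per-OS expertise comes in.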

      ...There is also a scheme for dynamically boosting a thread's priority for short periods...

      Periods determined by what? Time? An event? (Un)Availability of a resource?

      ...yield actually yields, it would be better if yield were implemented to do something sensible--like sleep for short periods--

      Hmmm... that's easily done if you're able to determine reliably that yield() is a no-op on your platform.

      ...create threads and suspend them pending some event...

      You could do something like that on Linux with Thread::Signal by creating a signal handler that, when called, will block on a variable, and another signal handler that resets that variable. Hmmm... I feel a module coming up... ;-)
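
For comparison, a signal-free variation on the same "block on a variable" idea can be sketched with threads::shared alone. This is not how Thread::Signal or any existing module works; it is a cooperative scheme that assumes the worker calls a checkpoint at convenient points, so a worker stuck in a long system call will not pause until it reaches the next checkpoint. The names checkpoint(), suspend_workers() and resume_workers() are hypothetical.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $suspended :shared = 0;   # 1 while workers should pause
my $progress  :shared = 0;   # counter the worker advances

# Worker-side checkpoint: block here for as long as the flag is set.
sub checkpoint {
    lock($suspended);
    cond_wait($suspended) while $suspended;
}

# Controller side: set or clear the flag; wake waiters on resume.
sub suspend_workers { lock($suspended); $suspended = 1 }
sub resume_workers  { lock($suspended); $suspended = 0; cond_broadcast($suspended) }

suspend_workers();                       # the worker starts out paused
my $worker = threads->create(sub {
    for (1 .. 3) {
        checkpoint();
        lock($progress);
        $progress++;
    }
});

sleep 1;                                 # worker is parked at its first checkpoint
print "while suspended: $progress\n";    # 0
resume_workers();
$worker->join;
print "after resume: $progress\n";       # 3
```

Because it relies only on shared variables and cond_wait(), this approach does not depend on signal support, at the cost of requiring cooperation from the worker code.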

      ...The situation is the 'pool of threads' scenario where you want to have the main thread block until one of the current threads has finished.

      I'm not sure what you mean, but are you sure this cannot be handled with Thread::Pool or Thread::Queue::Monitored or Thread::Conveyor::Monitored ?
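
To make the request concrete, here is one way the wished-for join_any() could be approximated in plain Perl. It is a sketch, not an existing threads method: spawn() and join_any() are hypothetical helpers, and it assumes a reasonably recent threads module that provides threads->object(). Each worker pushes its own thread ID onto a Thread::Queue when it finishes, and the main thread blocks on that queue.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $done = Thread::Queue->new;    # carries the tid of each finished worker

# Start a worker that reports its own tid on the queue when it is done.
sub spawn {
    my ($code, @args) = @_;
    # The list assignment forces list context, so join() returns a list.
    my ($thread) = threads->create(sub {
        my @result = eval { $code->(@args) };   # report even if $code dies
        $done->enqueue(threads->tid());
        return @result;
    });
    return $thread;
}

# Block until any worker has finished, then join it and return its result.
sub join_any {
    my $tid = $done->dequeue;     # blocks while no worker has finished
    return threads->object($tid)->join;
}

spawn(sub { my ($n) = @_; return $n * $n }, $_) for 1 .. 4;
my @squares = sort { $a <=> $b } map { join_any() } 1 .. 4;
print "@squares\n";               # 1 4 9 16
```

The caller never has to keep its own list of running threads, which is the bookkeeping the original complaint was about.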

      A minor critique of your current implementation of Thread::Running etc. is that I have to supply a list of threads to be checked

      That was the case in 0.01 ;-). In 0.02, running(), tojoin() and exited() default to all of the threads that Thread::Running knows about. Please RTFM ;-).

      Debugger support for threads needs improvement.

      I agree. I've created two debugging support modules for threads. Thread::Status which shows you where each thread is at a given time (unfortunately, only on Linux because of dependency on Thread::Signal) and Thread::Deadlock which will tell you where threads are when a deadlock occurs.

      ...view source around the current point on a thread by thread basis...

      I could add this as an option to Thread::Status.

      ...die, warn and Carp should all include the thread id in their output by default in threaded applications.

      I actually submitted a patch for this before 5.8.0. The problem is that many tests depend on the format of die, warn and Carp, and it was breaking all sorts of tests ;-( At least, that's how I remember it (a lot has happened since). But I agree, yes!

      Most of this stuff would be better, more easily and more efficiently done in threads.xs rather than requiring several extra modules each with its own state which (often) duplicates state already existing at the system level. Perhaps the powers that be would accept a threads::util module for this sort of stuff.

      Patches welcome ;-). But not for 5.8.1, I'm afraid. Jarkko would go ballistic if anyone would start submitting patches for threads related stuff for inclusion in 5.8.1... ;-)

      To date, I still haven't succeeded in building perl with a free compiler such that it isn't emasculated (no large file support or PerlIO for example).

      ??? on what system?

      Liz

        Periods determined by what? Time? An event? (Un)Availability of a resource?

        The way this works is:

        If a thread is designated as being in a 'dynamic priority class' and 'priority boosting' is enabled, then when the thread is 'woken' after a wait state (asynchronous IO completion, timer expiry or the like), the OS temporarily boosts its priority.

        The effect is that when whatever thread (system wide rather than process wide) was running when the asynchronous event occurred completes its timeslot, the thread that was waiting on that event has a higher (but not exclusive) chance of being selected as the next thread to be run by the scheduler. The thread's priority returns to its unboosted level the next time it is run.

        This provides a mechanism for the application to increase the probability that time-critical threads run as soon as possible once they have something to do without risking overriding higher priority threads elsewhere in the system as would be the case if this was done in a completely deterministic way.

        That's a poor description, but may be good enough.

        That's easily done if you're able to determine reliably that yield() is a no-op on your platform.

        I guess what I am saying here is that if a platform has a sensible alternative to a native yield(), then it would be better if that alternative was provided by the threads module by default, rather than requiring an extra piece of code that every user needs to use to determine if they need to add another piece of code to work around the fact that it isn't available by default, and then forcing everyone to come up with their own alternative implementation.

        Commensurate with that, it would be better to have the yield() function raise an exception on those platforms that do not implement it rather than needing to add on a module to detect that it is a no-op.
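
The kind of per-user workaround being described might look like the sketch below. It is only an illustration of "yield, then sleep for a short period": polite_yield() is a hypothetical name, the 10 ms default is an arbitrary assumption, and nothing here detects whether threads->yield() is a no-op on the current platform.

```perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(sleep time);

# Hypothetical wrapper: yield, then nap briefly so the caller gives up
# the CPU even where threads->yield() does nothing.
sub polite_yield {
    my ($nap) = @_;
    $nap = 0.01 unless defined $nap;   # assumed default, in seconds
    threads->yield();
    sleep($nap);                       # Time::HiRes::sleep accepts fractions
    return $nap;
}

my $start = time;
polite_yield() for 1 .. 5;
printf "five polite yields took about %.3f seconds\n", time - $start;
```

Having every application carry a helper like this is exactly the duplication that a default implementation in the threads module would avoid.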

        ... on Linux with Thread::Signal...

        The problem with that is that it won't work on platforms that don't support signals. It is also adding layers on top of the threads API in a way that makes it difficult to code cross-platform. The extra layers also make for very complicated debugging.

        I'm not saying that it isn't the right way to do it on Linux, only that at the architectural level, if it is going to be possible to write cross-platform threaded applications in perl, then the API's available need to be integrated at higher level than can really be done by add-ons written at the perl level.

        In this case, on Win32, SuspendThread and ResumeThread translate directly to native APIs. The only thing missing is access to the native thread handle. It's actually quite easy to obtain this handle as there is a native API that will return a usable handle for the currently running thread, which may or may not be the same as the native handle stored within the threads module.

        The problem comes in trying to divine what effect using this to suspend a thread externally to the threads module will have upon Perl's management of that thread. Maybe if the thread is suspended, nothing untoward will happen, but maybe Perl will take umbrage at having one of its threads rendered un-runnable without being consulted:)

        With regard to your many Thread::* modules. The problem (from my Win32 platform perspective) is that many of them rely entirely upon platform-dependent idioms like Signals. I can see how to implement most, if not all of them on the Win32 platform, but not in a way that would make them compatible with the modules you have already written. This would make life very tough for anyone trying to utilise them for cross-platform development.

        This is where trying to find a platform-independent iThreads API set would be useful. If (we?) could take a step back from specific implementations and try to look at what features would be useful to have, and try to encapsulate those features within an API that would enable them to be implemented reasonably efficiently on all (or at least most) platforms, then it might be possible to come up with something that would be truly usable.

        ...many tests depend on the format of die, warn and Carp...

        Perhaps the solution here is for the patch(es) to die, warn and Carp to detect whether threads were being used and only use the modified form if they were? That would leave existing non-threaded testcases unaffected.
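
At the Perl level, the "only when threads are in use" idea can be approximated for warn with a __WARN__ handler, as in the sketch below. This is not the patch discussed above and it does not touch die or Carp; the "[tid N]" format and the tag_with_tid() helper are assumptions for illustration. The check on %INC is one simple way to tell whether threads.pm has been loaded.

```perl
use strict;
use warnings;

# Prefix a message with the thread id, but only if threads.pm is loaded,
# so output from non-threaded programs stays exactly as it was.
sub tag_with_tid {
    my ($message) = @_;
    return $message unless $INC{'threads.pm'};
    return '[tid ' . threads->tid() . '] ' . $message;
}

$SIG{__WARN__} = sub { print STDERR tag_with_tid($_[0]) };

warn "before threads is loaded\n";                       # printed unchanged

require threads;
warn "from the main thread\n";                           # [tid 0] from the main thread
threads->create(sub { warn "from a worker\n" })->join;   # [tid 1] from a worker
```

The handler is installed before any thread is created, so each new thread inherits it when the interpreter is cloned.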

        Patches welcome ;-). But not for 5.8.1...

        Rather than patching the existing threads module directly, I was thinking more of coming up with a single threads::util module that would have access to the internal structures (native thread handles etc.) that encapsulated all of the new APIs. Whether that was later adopted or integrated could be left for powers-that-be to decide later on.

        on what system?

        Win32. I have tried MinGW, Borland and LCC so far. I get furthest with Borland, but its STDIO libraries don't have huge file (>4GB) support.

        Theoretically, using PerlIO ought to work around this, but as currently implemented (for Win32 at least), PerlIO still relies on STDIO calls for fseek(), fstat() and ftell(). I ought to be able to work around this by modifying the Win32_... versions of the three APIs in win32.c to use the native API equivalents rather than the C-library versions, but this is not trivial as they all require access to the FILE * handle that is their first parameter and this is not exposed by the C-libraries.

        I've searched the web hoping that Borland or someone else would have already released extensions to accommodate huge files, without success. I've also looked to see if the source code for the libraries was available, to no avail. I'm currently trying to reverse engineer the FILE * structure, but as many of the fields are bit-fields, the structure is used ubiquitously throughout the libraries, and these libraries are in turn used extensively when building perl, reverse engineering it and all the constants required to access and manipulate them is a distinctly non-trivial process.

        The biggest barrier to making changes to the perl build process is the incredibly convoluted and incestuous nature of the build mechanisms.

        (D|N)MAKE is used to build mini-perl, which is used to process header files into a config which is used to create makefiles which use perl to create header files which are processed using perl to .....

        I think I got lost somewhere in a maze of dark twisty passages. Combine that with the conditional compilation directives in every source module to account for all the disparate platform differences and it is amazing that perl manages to build anywhere. That it is successful in doing so on so many platforms is truly remarkable and a testimony to lots of hard work by lots of people. The problem is that the hands of all those individuals show, and the whole thing is now so complex that it takes a huge investment of time and expertise to even begin to make inroads into the process.

        I have had lots of time, and whilst my skill levels may not be up to the standards of those that precede me, I'm not exactly clueless. I've spent several months trying to get a handle on the whole thing and still it defeats me.

        Even where I have made some progress and have seen things that I might like to do, I am reluctant to try because I feel that making further piecemeal changes is likely to simply perpetuate and add to the problems I see. And each time someone, somewhere makes changes and additions that add to the complexity, it further removes the whole process from the 'ordinary man or woman', by which I mean those that don't have the luxury of unlimited time and high levels of expertise to devote.

        Step by step, this means that fewer and fewer people will ever be able to contribute to the perl codebase, leaving more and more people dependent upon the time, skills and motivation of fewer and fewer experts. From my perspective, this is tantamount to the death-knell for an open source project.

        I think I understand why Perl 6 is going for a clean sweep rather than trying to further extend the existing. I think that the greatest service that could be performed for the Perl 5 source is for someone to discard the existing build process and start again. I've had 2 or 3 attempts at starting this process, but you need to be a computer to be able to resolve all the paths and inter-dependencies. Trying to unwind them is nearly impossible for a human being--at least it is for this human being. I'm afraid I've had to give up on that approach.

        Please excuse any typos, but hopefully everything is understandable. Life is too short to pander to anal-retentive pedants who specialise in braying about the syntax instead of concentrating on the semantics.



Thread::Suspend uploaded
by liz (Monsignor) on Aug 22, 2003 at 13:36 UTC
    Just uploaded Thread::Suspend to my personal CPAN directory. From the POD:
    NAME
           Thread::Suspend - suspend and resume threads from another thread
    
    SYNOPSIS
               use Thread::Suspend;             # exports suspend() and resume()
               use Thread::Suspend qw(suspend); # only exports suspend()
               use Thread::Suspend ();          # threads class methods only
    
               my $thread = threads->new( sub { whatever } );
               $thread->suspend;                # suspend thread by object
               threads->suspend( $thread );     # also
               threads->suspend( $tid );        # suspend by thread ID
               threads->suspend;                # suspend all (other) threads
    
               $thread->resume;                 # resume a single thread
               threads->resume;                 # resume all suspended threads
    
    Unfortunately, this depends on Thread::Signal, so it currently only works on Linux.

    Liz
