http://qs321.pair.com?node_id=745527


in reply to Re^5: threads on Windows
in thread threads on Windows

Have you ever heard of the concept of a reduced testcase?

I'm sorry... when I did the work I was trying to find what was going "wrong" with the OP... so I aimed to disturb the underlying code as little as possible.

But with a little more time on my hands, it being Saturday... see below.

Did you see my post where I mentioned that "the OS serialises read and write accesses to console devices."?

I did, thank you. And, as I said, when I tried a file attached to STDIN the "problem" went away. So, yes, the effect appears to be peculiar to the console.

However:

So... the serialization of console read and write access doesn't appear to be the issue.

Have you considered the effect that all your verbose logging, and tracing, and "determining the state of STDIN" is having upon the outcome you are seeking to establish?

I have.

Which is why the trace is gathered in memory and not output until the end. It does introduce a small critical region in the trace gatherer... but that should only affect the progress of the threads (which is arbitrary in any case) when they are runnable. So if my means of observing the problem has affected it, I'm blessed if I can see how.
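
A minimal sketch of that kind of in-memory trace gatherer, for readers who want to try it. This is not the code from the earlier post: the names `trace`, `trace_report` and the shared `@trace` buffer are hypothetical, and it assumes a Perl built with ithreads. The `lock` is the small critical region referred to above.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Time::HiRes qw(time);

my $START = time();
my @trace :shared;    # hypothetical shared buffer: one "time<TAB>message" string per event

# Record an event. The lock is held only for the push, so threads
# contend for it only while they are runnable.
sub trace {
    my ($msg) = @_;
    my $now = time();
    lock(@trace);
    push @trace, sprintf( "%.6f\t%s", $now, $msg );
    return;
}

# Return the events in time order, formatted like the listings in this thread.
sub trace_report {
    lock(@trace);
    my @events = sort { $a->[0] <=> $b->[0] }
                 map  { [ split /\t/, $_, 2 ] } @trace;
    return map { sprintf "@ %9.6f: %s", $_->[0] - $START, $_->[1] } @events;
}

trace("main thread start");
async { trace("child thread ran") }->join;
trace("main thread completes");

# Nothing is printed until every thread has finished.
print "$_\n" for trace_report();
```

Because nothing is written to the console while the threads are running, the tracing itself should not add console I/O to the picture.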

I'm not sure what you mean by "determining the state of STDIN"... but if you mean the test to see if STDIN was open/closed after starting the "Terminal Watcher": then I can tell you that the main result is the same with or without that piece of code.


Reduced test case code:

use strict;
use warnings;

use threads;
use Thread::Queue;
use Time::HiRes qw(time) ;

my $START = time() ;

my @t = () ;
push @t, [time(), "main thread start"] ;

my $s_q = Thread::Queue->new ;
async {
  $s_q->enqueue(time()) ;
  $_ = <STDIN> ;
  $s_q->enqueue(time()) ;
}->detach() ;

if (@ARGV) {            # option to close STDIN
  close STDIN ;
  push @t, [time(), "closed STDIN"] ;
} ;

threads->yield() ;      # make sure STDIN thread starts

push @t, [time(), "about to start child thread"] ;

my $c_q = Thread::Queue->new ;
async {
  $c_q->enqueue(time()) ;
}->detach() ;

threads->yield() ;      # make sure child thread starts

push @t, [time(), "about to collect results"] ;

push @t, [$s_q->dequeue(), "STDIN thread started"] ;
push @t, [$s_q->dequeue(), "STDIN thread received input"] ;
push @t, [$c_q->dequeue(), "child thread ran"] ;

threads->yield() ;      # give threads opportunity to complete

push @t, [time(), "main thread completes"] ;

foreach my $t (sort { $a->[0] <=> $b->[0] } @t) {
  printf "@ %9.6f: %s\n", $t->[0] - $START, $t->[1] ;
} ;

which on my uni-processor, Windows XP, Perl 5.10.0, gives:
Z:\>perl threads_r.pl
12345678901234567890
@  0.000006: main thread start
@  0.014843: STDIN thread started
@  0.015520: about to start child thread
@  9.814112: STDIN thread received input
@  9.824127: child thread ran
@  9.834141: about to collect results
@  9.834517: main thread completes
but if the main thread closes STDIN:
Z:\>perl threads_r.pl close
12345678901234567890
@  0.000004: main thread start
@  0.008159: closed STDIN
@  0.008267: STDIN thread started
@  0.008845: about to start child thread
@  0.017748: child thread ran
@  0.022810: about to collect results
@ 10.725422: STDIN thread received input
@ 10.735437: main thread completes
(where the line of digits after each command is me pecking at the keyboard to introduce a delay...)

FWIW, here's what my Linux box (also uni-processor, Perl 5.10.0) does:

[GMCH@hestia ~]$ perl threads_r.pl
012345678901234567890
@  0.000012: main thread start
@  0.008300: STDIN thread started
@  0.008525: about to start child thread
@  0.024331: child thread ran
@  0.028648: about to collect results
@ 11.298085: STDIN thread received input
@ 11.298319: main thread completes
and closing STDIN makes no difference.


My hypothesis is: when creating the child thread the main thread is blocked because whatever "clones" STDIN cannot do so while some other thread is waiting on it -- where STDIN is attached to the console.

I can see how that could be related to the serialisation of access to the console device. Nevertheless, it is a bit of a surprise. Happily, closing STDIN does not block and does close the handle -- so there is a workaround.
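
A minimal sketch of that workaround, assuming the main thread never needs to read STDIN itself. The `run_workers` helper and the doubling job are hypothetical, purely for illustration. The only point is the ordering: start the thread that reads the console, close STDIN in the main thread, and only then create further threads.

```perl
use strict;
use warnings;
use threads;

# Start the console reader first. Going by the runs above, it keeps
# receiving input after the main thread closes its own STDIN.
async {
    while ( defined( my $line = <STDIN> ) ) {
        # handle $line here (hypothetical)
    }
}->detach;

# Assumption: the main thread does not read STDIN, so it can drop the handle.
close STDIN;

# Hypothetical helper: spawn $n workers and collect their results.
sub run_workers {
    my ($n) = @_;
    my @workers = map { threads->create( sub { return $_[0] * 2 }, $_ ) } 1 .. $n;
    return map { $_->join } @workers;
}

print join( ' ', run_workers(3) ), "\n";
```

If the reader is still blocked on the console when the main thread finishes, Perl will typically warn that it exited with active threads. Only the Windows console case was observed to need this; on the Linux box above, closing STDIN made no difference.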

Re^7: threads on Windows
by kennethk (Abbot) on Feb 21, 2009 at 22:44 UTC

    I'm glad to see continued discussion. I managed to create what I believe to be the simplest test case for this behavior:

    use strict;
    use warnings;
    use threads;
    use Thread::Semaphore;

    my $gag = new Thread::Semaphore(0);

    async {
        $gag->down;
        while (<>) {
            print;
        }
    }->detach;

    #close STDIN; # Comment this line for race-condition behavior
    $gag->up;

    my $value = async { return 1; }->join;

    The proper behavior for this code should be to immediately return control to the OS. In its race-condition configuration under Win32, it will usually wait for one input and parrot it back, and sometimes for two.

    Based on your earlier post, I assume your 5.10 can function w/o the use of semaphores. I'll play with it a bit more to see if the addition of instrumentation to an independent log file affects the buffering behavior, but it seems like waiting for input on a shared file handle is blocking thread creation, i.e. what you said.

      It's not a "race condition"! (And it is not "<> blocking thread creation".)

      You cannot clone the current thread whilst another thread has a (Perl internal) lock on one of the resources to be cloned. This is WAD (Working As Designed). Perl protects its internal data structures from concurrency corruption by serialising access to them through an internal mutex. If one of the process global resources (PGR) of a spawning thread is currently in use by another thread then the cloning of the current thread will be blocked until that mutex is released. In this case, the shared PGR is STDIN--but it could be any other PGR that blocks.

      In the following, if you supply an argument, STDIN will be closed in the main thread, hence it does not need to be cloned in order for threads 2 through 11 to be spawned, and the program will complete immediately.

      If you do not supply an argument, STDIN is not closed, and when the attempt is made to clone the main thread for thread 2, the internal mutex is being held by thread 1 because it is in a blocking read state on STDIN. Therefore, the main thread is blocked until that read state is satisfied.

      If you hit enter, the read on STDIN in thread 1 returns, lifting the mutex and allowing the main thread to continue. A few threads will be created before thread 1 gets another timeslice, and again enters a blocking read state whilst holding the internal mutex. The main thread is once again blocked. Hit enter again and the cycle repeats.

      See the two sample runs after the __END__ token:

      #! perl -slw
      use strict;
      use threads;

      async {
          getc while 1;
      }->detach;

      close STDIN if @ARGV;

      for ( 1 .. 10 ) {
          async {
              printf "Thread: %d ran\n", threads->tid;
          }->detach;
      }

      sleep 1 while threads->list( threads::running );

      __END__
      C:\test>junk
      Thread: 2 ran
      Thread: 3 ran
      Thread: 4 ran
      Thread: 5 ran
      Thread: 6 ran
      Thread: 7 ran
      Thread: 8 ran
      Thread: 9 ran
      Thread: 10 ran

      C:\test>junk 1
      Thread: 2 ran
      Thread: 3 ran
      Thread: 4 ran
      Thread: 5 ran
      Thread: 6 ran
      Thread: 7 ran
      Thread: 8 ran
      Thread: 9 ran

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Blocking while waiting for a lock is not blocking? You're not making sense.

      OK. Your test is much shorter than mine :-)

      I'd recommend adding threads->yield() (or sleep()) before the async for the child -- to ensure the "STDIN thread" has reached the <>. (There is a race condition there.)
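
      A minimal sketch of that suggestion, assuming a 5.10-style setup where the semaphore is not needed. The one-second sleep is an arbitrary choice, and a yield does not guarantee that the reader has reached its read, which is why the cruder sleep is there as well.

```perl
use strict;
use warnings;
use threads;

# The "STDIN thread": echo whatever arrives on STDIN.
async {
    while (<STDIN>) {
        print;
    }
}->detach;

# Give the STDIN thread a chance to reach its blocking read before the
# child is created. threads->yield() is only a hint to the scheduler,
# so a short sleep (arbitrary length) is added as a fallback.
threads->yield();
sleep 1;

# The child thread whose creation is being observed.
my $value = async { return 1 }->join;
print "child returned $value\n";
```

      With the delay in place, the reader should already be waiting on STDIN when the child is created, so the blocking no longer depends on which thread happens to run first.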

      ... I assume your 5.10 can function w/o the use of semaphores ...

      Yes. It doesn't seem to matter on my system whether the close STDIN runs before or after the "STDIN thread" reaches the <> (I've tried it both ways).

      But I cannot say whether that's a 5.10 thing, or because I'm running a uni-processor (or both, or neither).