Failing tests of Test::TCP on Windows

bojinlund has asked for the wisdom of the Perl Monks concerning the following question:

Hi!

The reason for this post is the message

"This patch was applied at 1.13. (Note, but it's still failing.)"

in https://rt.cpan.org/Public/Bug/Display.html?id=67292#txn-1187167.

There is a summary of the failing tests on Windows below. (There are very few failures on other operating systems.) Hopefully we can help the maintainer of the module with the remaining problems on Windows!

Background

"Bug #66016 for Test-TCP: Tests fail on Windows (even when they pass?)" https://rt.cpan.org/Public/Bug/Display.html?id=66016.

"Bug #66437 for Test-TCP: Tests are blocking in Windows 7" https://rt.cpan.org/Public/Bug/Display.html?id=66437.

"Bug #67292 for Test-TCP: Tests are blocking in Windows 7. With a prposed patch." https://rt.cpan.org/Public/Bug/Display.html?id=67292.
Problem: The tests in Test::TCP are blocking. Sometimes they leave the system in a state where it must be restarted.

The purpose of the (proposed and used) patch is to:

Reduce the frequency of problems when using kill on a pseudo-process in Windows.
Avoid using kill on a pseudo-process in the tests of Test-TCP.
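
The second goal can be sketched as a guard around kill. perlfork documents that the fork emulation on Windows hands out negative IDs for pseudo-processes, which matches the "PID: -xxxx" in the reports further down. The helpers `is_pseudo_process` and `stop_child` are hypothetical names used for illustration only; they are not part of Test::TCP or of the patch, and the sketch assumes a pseudo-process is asked to finish in some other way before it is reaped.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Test::TCP: true when $pid looks like a
# pseudo-process created by Perl's fork emulation on Windows.
# $os defaults to the running platform; it is a parameter only so that the
# check can be exercised on any machine.
sub is_pseudo_process {
    my ($pid, $os) = @_;
    $os = $^O unless defined $os;
    return ( $os eq 'MSWin32' && $pid < 0 ) ? 1 : 0;
}

# Hypothetical helper: signal only real processes, then reap the child.
# Assumption: a pseudo-process has already been told to finish by other
# means, otherwise waitpid would block.
sub stop_child {
    my ($pid) = @_;
    kill 'TERM', $pid unless is_pseudo_process($pid);
    return waitpid( $pid, 0 );
}
```

On a non-Windows system `is_pseudo_process` is always false, so `stop_child` behaves like a plain kill followed by waitpid.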

See also http://www.gossamer-threads.com/lists/perl/porters/261805 and "Proposal how to make modules using fork more portable".

Summary of the failing test of Test::TCP

The pattern used is:

Test files: List of failing SUBtests. The digit in parentheses indicates the number of times the subtest has failed for version 1.13 - 1.18 of Test::TCP on mswin32 (data from http://matrix.cpantesters.org/?dist=Test%3A%3ATCP).

In "mswin32" and Test::TCP version 1.13 there are 0 fail/29 pass tests. Version 1.14 => 1/8, 1.15 => 3/13, 1.16 => 6/10, 1.17 => 0/19 and 1.18 => 5/53. For version 1.13 - 1.18 there are 15 failing tests and 124 passing.

Test report pattern: Pattern in the test report.

1) Test returned 9 even if all subtests passed

Test files: t/03_return_when_sigterm.t, t/04_die.t, t/06_nest.t(2), t/09_fork.t

Test report pattern:

Dubious, test returned 9 (wstat 2304, 0x900)
All x subtests passed

This problem is treated in https://rt.cpan.org/Public/Bug/Display.html?id=66016. The conclusion is that there is some probability that Windows returns 9 instead of the correct zero. One failing subtest in about 140 test runs is a little bit too high!?
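
For readers unfamiliar with the "wstat" figure in the harness output: it is the 16-bit wait status, with the exit code in the high byte and the signal number in the low seven bits. A minimal sketch of the decoding follows; `decode_wstat` is a hypothetical helper name, not something provided by the test harness.

```perl
use strict;
use warnings;

# Hypothetical helper: split a 16-bit wait status (the layout Perl uses
# for $?) into exit code, signal number and core-dump flag.
sub decode_wstat {
    my ($wstat) = @_;
    return +{
        exit_code => $wstat >> 8,
        signal    => $wstat & 127,
        core_dump => ( $wstat & 128 ) ? 1 : 0,
    };
}

my $status = decode_wstat(2304);    # 2304 == 0x900, as in the report above
printf "exit=%d signal=%d\n", $status->{exit_code}, $status->{signal};
# prints "exit=9 signal=0"
```

So "returned 9 (wstat 2304, 0x900)" describes a process that exited with code 9, not one that was stopped by signal 9.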

2) Child process does not block

Test files: t/04_die.t(4), t/06_nest.t(4)

Test report pattern:

[Test::TCP] Child process does not block(PID: -xxxx, PPID: xxxx) at ... \lib/Test/TCP.pm line 121.
t/xxxx.t ................. ok

TCP.pm (line 121 marked with comment):

sub start {
    my $self = shift;
    if ( my $pid = fork() ) {
        # parent.
        $self->{pid} = $pid;
        Test::TCP::wait_port($self->port);
        return;
    } elsif ($pid == 0) {
        # child process
        $self->{code}->($self->port);
        # should not reach here
        if (kill 0, $self->{_my_pid}) { # warn only parent process still exists
            warn("[Test::TCP] Child process does not block(PID: $$, PPID: $self->{_my_pid})"); # line 121
        }
        }
        exit 0;
    } else {
        die "fork failed: $!";
    }
}
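
One detail in the quoted sub may be worth a second look: when fork fails it returns undef, and `undef == 0` is true in Perl, so the `elsif ($pid == 0)` branch would be taken and the final `else` would never run. A minimal sketch of the usual idiom is shown here; `spawn` is a hypothetical helper for illustration, not a proposed patch for Test::TCP.

```perl
use strict;
use warnings;

# Hypothetical helper: run $child_code in a forked child and return the
# child's PID to the parent. Dies in the parent if fork itself fails.
sub spawn {
    my ($child_code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;    # undef means failure
    if ( $pid == 0 ) {
        # child process
        $child_code->();
        exit 0;
    }
    return $pid;                                  # parent process
}

my $pid = spawn( sub { exit 3 } );
waitpid( $pid, 0 );
printf "child exit code: %d\n", $? >> 8;
```

Checking definedness first keeps the three outcomes of fork (failure, child, parent) separate.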

3) Target machine actively refused

Test files: t/01_simple.t(2), t/10_oo.t(8)

Test report pattern:

[Test::TCP] Child process does not block(PID: -xxxx, PPID: xxxx) at ... /Test/TCP.pm line 121.
Cannot open client socket: No connection could be made because the target machine actively refused it. at t/10_oo.t line 21.
# Looks like you planned 22 tests but ran 20.
# Looks like your test exited with 10061 just after 20.
t/xxxx.t ................... 
Dubious, test returned 77 (wstat 19712, 0x4d00)
Failed 2/22 subtests

Test-TCP-1.16/t/10_oo.t line 21:

my $sock = IO::Socket::INET->new(
    PeerPort => $server->port,
    PeerAddr => '127.0.0.1',
    Proto    => 'tcp'
) or die "Cannot open client socket: $!";
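
The three numbers in the report above (10061, 77 and 19712) fit together if the exit status is cut down to one byte. perlfunc documents that an uncaught die exits with the value of `$!` when it is non-zero, and 10061 is the Winsock error WSAECONNREFUSED. That the truncation happens this way on the failing machines is an assumption; the arithmetic itself can be checked with a short sketch, in which both helper names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
# Keep the low byte of an error number, as a one-byte exit status would.
sub truncated_exit_code {
    my ($errno) = @_;
    return $errno & 0xFF;
}

# Build the wait status a harness would see for a given exit code.
sub wstat_for_exit_code {
    my ($code) = @_;
    return ( $code & 0xFF ) << 8;
}

my $errno = 10061;    # WSAECONNREFUSED, "actively refused"
my $code  = truncated_exit_code($errno);
printf "exit code %d, wstat %d (0x%x)\n",
    $code, wstat_for_exit_code($code), wstat_for_exit_code($code);
# prints "exit code 77, wstat 19712 (0x4d00)"
```

Read this way, "test returned 77" is the refused connection from `die` surfacing as the exit code, not a separate failure.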

4) Cannot open client socket

Test files: t/09_fork.t(4)

Test report pattern:

[Test::TCP] Child process does not block(PID: -xxxx, PPID: xxxx) at ... lib/Test/TCP.pm line 121.

#   Failed test 'socket is connected'
#   at t/09_fork.t line 35.
# Cannot open client socket: 
# Looks like you planned 6 tests but ran 5.
# Looks like you failed 1 test of 5 run.
t/09_fork.t ................. 
Dubious, test returned 1 (wstat 256, 0x100)
Failed 2/6 subtests

Test-TCP-1.16/t/09_fork.t line 28:

        # after the child has exited, we need to make sure that
        # the server hasn't gone away.
        my $sock = IO::Socket::INET->new(
            PeerPort => $port,
            PeerAddr => '127.0.0.1',
            Proto    => 'tcp'
        );
        if (! ok $sock, "socket is connected") { #line 35
            return diag("Cannot open client socket: $!");
        }
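
To tell "the server was slow to come up" apart from "the server has gone away", a test could probe the port a few times before giving up. The following is only a sketch: `connect_with_retry` is a hypothetical helper, not part of Test::TCP, and the number of attempts and the delay are arbitrary choices.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Time::HiRes qw(sleep);    # allows fractional sleep intervals

# Hypothetical helper: try to connect to 127.0.0.1:$port up to $tries
# times, waiting $delay seconds between attempts. Returns the connected
# socket, or nothing if every attempt failed.
sub connect_with_retry {
    my ( $port, $tries, $delay ) = @_;
    for my $attempt ( 1 .. $tries ) {
        my $sock = IO::Socket::INET->new(
            PeerPort => $port,
            PeerAddr => '127.0.0.1',
            Proto    => 'tcp',
        );
        return $sock if $sock;
        sleep $delay if $attempt < $tries;
    }
    return;
}

# Demonstration against a listener opened in the same process.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,             # let the system pick a free port
    Proto     => 'tcp',
    Listen    => 5,
) or die "Cannot listen: $!";

my $sock = connect_with_retry( $listener->sockport, 5, 0.1 );
print $sock ? "connected\n" : "gave up\n";
```

If every attempt fails, the server has most likely exited, which would agree with the "Child process does not block" warning that precedes the failure in the report.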

Questions

Any ideas about the reasons for the failing tests?
Is there a more portable way to implement the tests of Test::TCP?
How should the failing tests be changed?
Does Test::TCP need to be changed to be more portable?
What are the consequences of the "non-deterministic" behaviour of Windows?
Is it enough to run the test of Perl modules just once or how many times are they needed to be run?
Is this only a problem with the implementation of Perl 5?
Or are there similar problems with Perl 6 and Python on Windows?

Regards

Bo Johansson

Re: Failing tests of Test::TCP on Windows by tobyink (Canon) on Mar 05, 2013 at 09:52 UTC
I can't answer the Test::TCP-related questions, but with regard to "is it enough to run the test of Perl modules just once or how many times are they needed to be run?" I can venture an opinion.

The fact that a test passes once does not prove it will always pass. Take a look at this...

$ date
Tue Mar 5 09:30:54 GMT 2013
$ perl ~/tmp/scratch.pl
1..1
ok 1 - DateTime returns a value for "second" method
$ perl ~/tmp/scratch.pl
1..1
ok 1 - DateTime returns a value for "second" method
$ perl ~/tmp/scratch.pl
1..1
not ok 1 - DateTime returns a value for "second" method
#   Failed test 'DateTime returns a value for "second" method'
#   at /home/tai/tmp/scratch.pl line 7.
# Looks like you failed 1 test of 1.

What was the test case? Spot the obvious error...

use strict;
use warnings;
use Test::More tests => 1;
use DateTime;

ok(
    DateTime->now->second,
    'DateTime returns a value for "second" method',
);

That's right, the test doesn't check the definedness of `DateTime->now->second`; it tests its truthfulness. When the clock hits an even minute, the `second` method correctly returns 0.

However, datetime-specific issues and race conditions can be very unpredictable. There are also all kinds of subtle bugs that can crop up when running test cases in parallel with each other or in a random order.

These kinds of errors are quite rare, and you may need to run the test suite hundreds of times to catch them. So in most cases the cost-benefit ratio doesn't justify running the test suite multiple times as part of the installation process.

Usually better would be a continuous integration policy, where the test suite is run before each code checkin, with test case regressions noted in the commit message. (And ideally the code should pass the suite before checkins are merged into the master/default/trunk codebase.) A policy like this should catch a large number of these sporadic errors.
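
A corrected version of the test case could look something like the sketch below. It uses the seconds field of the core `localtime` function as a stand-in for `DateTime->now->second`, so that it runs without DateTime installed. The helper `second_is_valid` and the upper bound of 61 (to leave room for leap seconds) are assumptions made for illustration.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: a "second" value is valid when it is defined and
# within 0..61 (assumed bound), so a legitimate 0 passes the check.
sub second_is_valid {
    my ($second) = @_;
    return ( defined $second && $second >= 0 && $second <= 61 ) ? 1 : 0;
}

my $second = ( localtime )[0];    # core stand-in for DateTime->now->second

# Test definedness and range instead of truthiness.
ok( defined $second,          'a value is returned for the seconds field' );
ok( second_is_valid($second), 'the seconds value is within range' );
```

Unlike the original test, this one gives the same result whichever second of the minute it happens to run in.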
`package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name`
Re: Failing tests of Test::TCP on Windows by Corion (Patriarch) on Mar 05, 2013 at 09:56 UTC
IMO, trying to mix sockets and fork on Windows is fraught with problems. It's great when it works, but usually when external communication or sharing of resources gets involved (like with sockets), the problems outweigh the perceived benefits, especially in a `Test::` module.
Re^2: Failing tests of Test::TCP on Windows by bojinlund (Monsignor) on Mar 06, 2013 at 07:55 UTC
Corion, thanks for the answer! I suppose that "sharing of resources" in your answer means sharing resources between threads within one Windows process. As threads are implemented using Windows threads, they then have the same problems as fork!? My conclusion is that you need two Windows processes to test a TCP connection. Or is there some other way to use threads? I have difficulty understanding what you should avoid when using threads on Windows. Is there any documentation explaining threads on Windows from a Perl perspective? Which types of resources are problematic to share between Windows threads?
Re^3: Failing tests of Test::TCP on Windows by Corion (Patriarch) on Mar 06, 2013 at 08:08 UTC
The semantics of threads are far easier to reason about, as closing a handle in one thread will close the handle for the complete process, for example. When using fork(), this would be unexpected, but when using the fork emulation on Windows, the same happens. This is obvious when you keep in mind that on Windows, fork emulation is implemented through threads, but programmers using fork() usually do not come from Windows and don't think of fork behaving differently.

