PerlMonks  

<> oddity ?

by Krambambuli (Curate)
on Mar 27, 2013 at 10:08 UTC ( [id://1025665]=perlquestion )

Krambambuli has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I just can't find the underlying reason for what I feel is a bit of an oddity and so I'm hoping that some wise monk might be able to shed light on it.

With a sample file named sample.txt that reads like
    www1.example.com
    www2.example.com
    www3.example.com
and the code below, I'm blocked forever (on reading from STDIN ?!!) when running it like

$ ./weird.pl sample.txt

whereas

$ ./weird.pl <sample.txt

or

$ cat sample.txt | ./weird.pl

works perfectly OK. I'm trying to make a number of DNS requests in an async manner, controlling the maximum number of open sessions at any time and processing the names of the hosts to be checked in a 'stream-like' fashion.

I have work-arounds, so I'm not looking for alternate solutions, but I'm just trying to understand why this code doesn't work and blocks, apparently when hitting the end of the file given on the command line.

Am I doing something that isn't right, or is Perl this time not keeping its promises about how files on the command line are processed?

Here's the code:
    #!/usr/bin/perl

    use Data::Dumper;
    use IO::Select;
    use Net::DNS;

    my $timeout = 10;
    my $sel;    # global IO::Select object

    my $max_resolver = 1;
    my $max_sockets  = 2;
    my $max_sessions = $max_resolver * $max_sockets;

    my @resolvers;
    push( @resolvers, Net::DNS::Resolver->new ) for 1 .. $max_resolver;

    my $resolver_counter = 0;
    my $socket_counter   = 0;
    my $session_counter  = 0;
    my $host_counter     = 0;
    my %sockets          = ();

    local $| = 1;

    while ( <> ) {
        chomp;
        print "INITIAL Line: $_\n";
        if ($session_counter < $max_sessions) {
            # make a new DNS session
            my $resolver = $resolvers[ $resolver_counter ];
            my $bgsock   = $resolver->bgsend( $_, 'A' );
            $sockets{ $bgsock } = $resolver;
            if (defined $sel) {
                $sel->add( $bgsock );
            }
            else {
                $sel = IO::Select->new( $bgsock );
            }
            ++$session_counter;
            if (++$socket_counter == $max_sockets) {
                $socket_counter = 0;
                # wrap around to the first resolver
                if (++$resolver_counter == $max_resolver) {
                    $resolver_counter = 0;
                }
            }
            ++$host_counter;
            last if $session_counter == $max_sessions;
        }
        next;
    }

    print "OKOK \$ARGV: $ARGV\n";

    while ($host_counter > 0) {
        my @ready = $sel->can_read( $timeout );
        if ( scalar @ready > 0) {
            foreach my $sock (@ready) {
                my $resolver = $sockets{ $sock };
                my $response = $resolver->bgread( $sock );
                $sel->remove($sock);
                delete $sockets{ $sock };
                --$host_counter;

                print "AAAA \$ARGV: $ARGV\n";
                my $line = <>;
                print "BBBB \$ARGV: $ARGV\n";
                if ($line ) {
                    print "ADDITIONAL Line: $line" ;
                    chomp $line;
                    my $new_sock = $resolver->bgsend( $line, 'A' );
                    $sel->add( $new_sock );
                    $sockets{ $new_sock } = $resolver;
                    ++$host_counter;
                    print "CCCC \$ARGV: $ARGV\nHostcounter: $host_counter\n";
                }
            }
        }
        else {
            warn "\n\ntimed out after $timeout seconds\n\n";
        }
    }

    exit;
The issue clearly has something to do with the use of IO::Select or Net::DNS. If I cut out that part and do only the reading and printing, everything works just as expected.

Many thanks in advance.

Replies are listed 'Best First'.
Re: <> oddity ?
by pvaldes (Chaplain) on Mar 27, 2013 at 11:11 UTC

    I'm just trying to understand why this code doesn't work and blocks... I'm blocked forever

    No, the program is not blocked; it is simply waiting at line 69 (my $line = <>;) for your input. If you type something, it continues. As expected.

      This is well spotted! If I modify my example to include this extra <>:

      while(<>) {
          print;
      }
      my $line = <>;
      print "END\n";

      then only

      perl stdin.pl stdin.pl

      waits for more input, while the other two versions finish immediately. That is still strange...

      UPDATE: I guess Perl will do unshift(@ARGV, '-') unless @ARGV; (copied from LanX above) and "re-open" STDIN because @ARGV was exhausted in the while loop already. When using pipes or redirections this logic will not be applied and <> will only return EOF or similar.

        It is explained in the post of LanX

        In the first case the program is waiting for the user to type a value for the variable $line; in the second and third cases STDIN is already at end-of-file, so <> returns undef immediately.

        > That is still strange...

        Is it really?

        I mean what is this code actually supposed to do?

        Let's say it clearly: Reading from an exhausted filehandle is nonsense!

        Just the fact that this filehandle is magic leads to undefined behavior, because interactive input can't be exhausted.

        Using either STDIN or ARGV here is not only better, it explicitly demonstrates the intention of this line.

        Cheers Rolf

        ( addicted to the Perl Programming Language)
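A minimal sketch of the explicit-handle suggestion above, assuming a temporary file stands in for sample.txt (in weird.pl the path would come from $ARGV[0]). The file contents and variable names are illustrative only. It shows that an ordinary lexical filehandle keeps returning undef once it is exhausted, with no fallback to STDIN:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a temp file plays the role of sample.txt.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "www1.example.com\nwww2.example.com\n";
close $tmp or die "close failed: $!";

# Open the file explicitly instead of relying on the magic <>.
open my $in, '<', $path or die "cannot open $path: $!";

my @hosts;
while (my $host = <$in>) {
    chomp $host;
    push @hosts, $host;
}

# Reading an exhausted ordinary handle again just returns undef each time.
my @extra;
push @extra, scalar readline($in) for 1 .. 3;
my $undef_count = grep { !defined } @extra;

close $in;

print "hosts: @hosts\n";
print "reads past EOF that returned undef: $undef_count\n";
```

With a handle like this, an extra read after the last host cannot block on the keyboard, whatever way the script was started.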

      As expected... hmm. Define a sample.txt like
      1
      2
      3
      And then run $ test.pl sample.txt

      where test.pl is
      #!/usr/bin/perl
      use strict;
      use warnings;

      my $i = 0;
      while (<>) {
          print $_;
          last if ++$i >= 2;
      }

      my $line;
      print "My extra line: $line" while $line = <>;
      Why would this program then NOT wait for any further input via STDIN ?
        > Why would this program then NOT wait for any further input via STDIN ?

        Perl can't read your mind. If you want to read from STDIN, use STDIN explicitly instead of stretching DWIM magic behavior till it breaks.

        UPDATE:

        actually your example is reading from STDIN, but maybe you should stop the redirecting from file to STDIN if you wanna read from interactive input (keyboard).

        see select

        Cheers Rolf

        ( addicted to the Perl Programming Language)

Re: <> oddity ?
by LanX (Saint) on Mar 27, 2013 at 10:26 UTC
    this

    $ ./weird.pl sample.txt

    is fundamentally different to

    $ ./weird.pl <sample.txt

    or

    $ cat sample.txt | ./weird.pl

    the first passes a filename as argument while the latter two pipe the content of this file into STDIN.
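To make that difference visible from inside a script, here is a small sketch. The input_source function is a hypothetical helper, not part of any module; it reports whether the script received filename arguments or has to fall back to STDIN, and whether STDIN is a terminal:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: describe where <> would read from,
# given the argument list and whether STDIN is a terminal.
sub input_source {
    my ($args, $stdin_is_tty) = @_;
    return "file arguments: @$args" if @$args;
    return $stdin_is_tty ? 'STDIN (terminal)' : 'STDIN (pipe or redirect)';
}

# -t tests whether the handle is attached to a terminal.
my $is_tty = (-t STDIN) ? 1 : 0;
print input_source(\@ARGV, $is_tty), "\n";
```

Run as ./script sample.txt, the shell passes sample.txt in @ARGV. With <sample.txt or a pipe, @ARGV is empty and the shell has already connected the data to STDIN before Perl starts.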

    See ARGV for a way to solve this.

    Cheers Rolf

    ( addicted to the Perl Programming Language)

      If you call the code below stdin.pl,

      while(<>) { print; }

      then all three commands below result in the same output:

      $ perl stdin.pl stdin.pl
      $ perl stdin.pl < stdin.pl
      $ cat stdin.pl | perl stdin.pl
        you're right

        from perlop

               The null filehandle <> is special: it can be used to emulate the
               behavior of sed and awk.  Input from <> comes either from standard
               input, or from each file listed on the command line.  Here’s how it
               works: the first time <> is evaluated, the @ARGV array is checked, and
               if it is empty, $ARGV[0] is set to "-", which when opened gives you
               standard input.  The @ARGV array is then processed as a list of
               filenames.  The loop

        while (<>) {
            ...    # code for each line
        }

        is equivalent to the following Perl-like pseudo code:

        unshift(@ARGV, '-') unless @ARGV;
        while ($ARGV = shift) {
            open(ARGV, $ARGV);
            while (<ARGV>) {
                ...    # code for each line
            }
        }
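The pseudo code quoted from perlop has a consequence that matters for this thread: <> reports end-of-file only once. A later <> starts over, finds @ARGV empty, and opens '-' (STDIN). Below is a minimal sketch of that behavior. It assumes STDIN may be pointed at a temporary file so that the demo cannot block on a terminal; the make_file helper and the file contents are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $content to a fresh temp file, return its name.
sub make_file {
    my ($content) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "close failed: $!";
    return $name;
}

my $arg_file   = make_file("a\nb\n");          # stands in for sample.txt
my $stdin_file = make_file("from STDIN\n");    # stands in for the keyboard

# Assumption: redirect STDIN to a file so this demo never waits for typing.
open STDIN, '<', $stdin_file or die "cannot reopen STDIN: $!";

@ARGV = ($arg_file);

my @from_args;
while (my $line = <>) {    # loop ends on the first (and only) EOF
    push @from_args, $line;
}

# @ARGV is empty now, so this <> starts over and opens '-' (STDIN).
my $after_eof = <>;

print "lines from the file argument: ", scalar @from_args, "\n";
print "read after EOF: ", defined $after_eof ? $after_eof : "undef\n";
```

If STDIN is a terminal instead, that last <> waits for typing, which matches what happens with ./weird.pl sample.txt. With a redirect or a pipe, STDIN is already at end-of-file, so the same read returns undef at once. A loop such as while ($line = <>) stops at the first undef and so never triggers the fallback.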

        Cheers Rolf

        ( addicted to the Perl Programming Language)

      I'm aware of that, thanks. But *should* my program work nevertheless or not ?

      So far I don't see why it shouldn't. I can only see that it doesn't, and I'm trying to understand *why*.

      Thank you.
        OK ... sorry ... I never needed that magic ...

        Your code is too long for me to spot the reason; you should try shortening it until you have isolated the problem.

        Maybe one of your modules (like IO::Select ) clutters something?

        Cheers Rolf

        ( addicted to the Perl Programming Language)

        I still do not know why your code does not work. I wanted to reply to LanX only. Apologies for cluttering your thread. I'll be silent now.

Re: <> oddity ?
by hdb (Monsignor) on Mar 27, 2013 at 11:00 UTC

    Just one more thought... The difference between the first and the other two methods is that in the first you read directly from the file, whereas in the others you go via the shell somehow. The latter might modify your file. So I am wondering whether there is something in your file sample.txt causing the problem, such as a missing newline or some non-ASCII, non-printing character? (No further noise from my side...)

Re: <> oddity ?
by hdb (Monsignor) on Mar 27, 2013 at 10:17 UTC
    while(<>)

    reads from STDIN. Try this script:

    while(<>) { print; }

    This should behave similarly to yours under your 3 scenarios:

    $ ./weird.pl sample.txt
    $ ./weird.pl <sample.txt
    $ cat sample.txt | ./weird.pl
    UPDATE: I am stupid and should read "Programming Perl" again. Apologies for posting useless stuff. No idea what goes wrong with your code.
