Does Net::Ping have concurrency issues?

by isotope (Deacon)
on May 30, 2003 at 17:38 UTC

isotope has asked for the wisdom of the Perl Monks concerning the following question:

Net::Ping works fine in a loop over a list of targets, but does not seem to do what I want if I run several sessions simultaneously.

Warning: long post follows

I'm trying to write a script to discover hosts on a local subnet. In the past, they were discovered with ping(1) to the subnet broadcast address like so:
    my @lines = `ping $self->{broadcast_ip} -b -c2 2>&1 | grep \'bytes from\'`;
...with the IP addresses parsed out of those lines, etc.
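The parsing step mentioned above might look something like the following sketch. The sample lines imitate typical Linux ping(1) reply lines; the exact format on the original system is an assumption, and the helper name parse_ping_hosts is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pull unique IPv4 addresses out of ping(1)
# reply lines such as "64 bytes from 10.100.19.85: icmp_seq=1 ...".
# The line format is an assumption; adjust the pattern to your ping.
sub parse_ping_hosts {
    my @lines = @_;
    my (%seen, @hosts);
    for my $line (@lines) {
        next unless $line =~ /bytes from (\d{1,3}(?:\.\d{1,3}){3})/;
        push @hosts, $1 unless $seen{$1}++;   # keep first occurrence only
    }
    return @hosts;
}

# Made-up sample input (two replies from .85, one from .86).
my @sample = (
    "64 bytes from 10.100.19.85: icmp_seq=1 ttl=64 time=0.31 ms\n",
    "64 bytes from 10.100.19.86: icmp_seq=1 ttl=64 time=0.35 ms (DUP!)\n",
    "64 bytes from 10.100.19.85: icmp_seq=2 ttl=64 time=0.29 ms\n",
);
print "$_\n" for parse_ping_hosts(@sample);
```

With -c2 each responding host can appear twice, which is why the sketch de-duplicates while preserving order.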
However, some of the devices on the subnet do not respond to broadcast pings, but I would still like to discover them. They will all respond to direct pings. One approach would be to ping each possible address in the subnet. If nearly every address were occupied, this would take little time. However, this network is sparsely populated, so such an approach would spend minutes waiting for timeouts on unpopulated addresses.
I thought I'd be smart and fork off a client process for each address. In theory, I could compress the total time to one timeout period (which could be as little as 5 seconds, given a local subnet), plus a slight incremental time spent forking each child, plus a 25ms delay per child to prevent network flooding. This takes less than 20 seconds, even with an empty network, versus as long as 254 * 5 = 1270 seconds (more than 20 minutes) for doing sequential pinging. That sure makes forking attractive.
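The timing estimate above can be checked with a few lines of arithmetic. This is only a sketch: the function names are invented here, and fork overhead is ignored, so the forked figure is a lower bound rather than a measurement.

```perl
use strict;
use warnings;

# Worst case for an empty subnet: every ping runs to its full timeout.
sub sequential_seconds {
    my ($hosts, $timeout) = @_;
    return $hosts * $timeout;
}

# Forked case: the last child starts after ($hosts - 1) stagger delays
# and then waits one timeout. Fork overhead is NOT modelled.
sub forked_seconds {
    my ($hosts, $timeout, $stagger) = @_;
    return ($hosts - 1) * $stagger + $timeout;
}

printf "sequential: %d s\n",   sequential_seconds(254, 5);
printf "forked:     %.3f s\n", forked_seconds(254, 5, 0.025);
```

For 254 hosts, a 5 second timeout and a 25 ms stagger, this gives 1270 seconds sequentially against roughly 11 seconds forked, which leaves room for forking overhead within the "less than 20 seconds" figure.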
Here's what I tried to do:
    #!/usr/bin/perl -w
    use strict;
    use Net::Ping;
    use Time::HiRes qw(usleep);
    use Fcntl qw(:DEFAULT :flock);
    use POSIX qw(tmpnam mkfifo);

    my $master = $$;

    # Create the FIFO
    my $fifo;
    do {
        $fifo = tmpnam();
    } until mkfifo($fifo, 0666);

    # Generate list of hosts to ping
    my @targets;
    foreach my $target (84..87) {
        push(@targets, '10.100.19.'.$target);
    }

    # Make the pinger
    my $pinger = Net::Ping->new('icmp', 5);

    # Fork off processes
    my %kids;
    foreach my $target (@targets) {
        my $kid = fork;
        if($kid) {
            # Parent
            $kids{$kid} = $target;
        }
        else {
            # Child
            print "Child $target started\n";
            if($pinger->ping($target)) {
                # PING! Throw it in the FIFO
                print "Found $target!\n";
                sysopen(FH, $fifo, O_WRONLY | O_APPEND) or die "Can't open FIFO $fifo: $!\n";
                print "Locking FIFO\n";
                flock(FH, LOCK_EX) or die "Can't lock FIFO $fifo: $!\n";
                print "appending to FIFO\n";
                print FH $target."\n";
                close(FH);
            }
            else {
                warn "$target: $!\n";
                print "No response from $target\n";
            }
            $pinger->close();
            print "Child $target exiting\n";
            exit();
        }
        # Sleep 25 ms to prevent flooding
        usleep(25_000);
    }

    # Cleanup processes and gather results
    my @pings;
    print "Parent opening FIFO\n";
    sysopen(FIFO, $fifo, O_RDONLY | O_NONBLOCK) or die "Can't open FIFO $fifo for reading: $!\n";
    print "Parent looping over remaining kids\n";
    while(%kids) {
        print "Parent looping wait\n";
        while((my $kid = wait()) > 0) {
            print "Parent reaping $kids{$kid}\n";
            delete($kids{$kid});
            print "Parent reading fifo\n";
            while(defined(my $line = <FIFO>)) {
                chomp($line);
                print "Parent got $line!\n";
                push(@pings, $line);
            }
        }
    }
    print "Parent closing fifo\n";
    close(FIFO);

    foreach my $ping (@pings) {
        print "PONG: $ping\n";
    }

    # Delete the FIFO whenever we exit
    END {
        if($$ == $master) {
            unlink($fifo) or die "Couldn't unlink FIFO $fifo: $!\n";
            print "$$ unlinked $fifo\n";
        }
    }
To minimize the test case, I've set the targets to be 10.100.19.84 through 10.100.19.87, where .85 and .86 actually exist. The tcpdump output below shows the attempts to ping, along with successful replies from .85 and .86, but the script reports no response from anything.
    09:33:29.275634 arp who-has 10.100.19.84 tell 10.100.19.1
    09:33:29.305634 10.100.19.1 > 10.100.19.85: icmp: echo request (DF)
    09:33:29.305634 10.100.19.85 > 10.100.19.1: icmp: echo reply
    09:33:29.355634 10.100.19.1 > 10.100.19.86: icmp: echo request (DF)
    09:33:29.355634 10.100.19.86 > 10.100.19.1: icmp: echo reply
    09:33:29.395634 arp who-has 10.100.19.87 tell 10.100.19.1
    09:33:30.275634 arp who-has 10.100.19.84 tell 10.100.19.1
    09:33:30.395634 arp who-has 10.100.19.87 tell 10.100.19.1
    09:33:31.275634 arp who-has 10.100.19.84 tell 10.100.19.1
    09:33:31.395634 arp who-has 10.100.19.87 tell 10.100.19.1
If I increase the usleep delay from 25_000 us to 10_000_000 us, I'm essentially running it like a loop, and I get the replies for .85 and .86:
    script output:
    Child 10.100.19.84 started
    10.100.19.84:
    No response from 10.100.19.84
    Child 10.100.19.84 exiting
    Child 10.100.19.85 started
    Found 10.100.19.85!
    Child 10.100.19.86 started
    Found 10.100.19.86!
    Child 10.100.19.87 started
    10.100.19.87:
    No response from 10.100.19.87
    Child 10.100.19.87 exiting
    Parent opening FIFO
    Parent looping over remaining kids
    Parent looping wait
    Parent reaping 10.100.19.87
    Parent reading fifo
    Parent reaping 10.100.19.84
    Parent reading fifo
    Locking FIFO
    appending to FIFO
    Child 10.100.19.85 exiting
    Parent reaping 10.100.19.85
    Parent reading fifo
    Parent got 10.100.19.85!
    Locking FIFO
    appending to FIFO
    Child 10.100.19.86 exiting
    Parent reaping 10.100.19.86
    Parent reading fifo
    Parent got 10.100.19.86!
    Parent closing fifo
    PONG: 10.100.19.85
    PONG: 10.100.19.86
    28674 unlinked /tmp/fileMVZqps

    tcpdump:
    10:27:38.255634 arp who-has 10.100.19.84 tell 10.100.19.1
    10:27:39.255634 arp who-has 10.100.19.84 tell 10.100.19.1
    10:27:40.255634 arp who-has 10.100.19.84 tell 10.100.19.1
    10:27:48.255634 10.100.19.1 > 10.100.19.85: icmp: echo request (DF)
    10:27:48.255634 10.100.19.85 > 10.100.19.1: icmp: echo reply
    10:27:51.665634 arp who-has 10.100.19.86 tell 10.100.19.1
    10:27:51.665634 arp reply 10.100.19.86 is-at 0:0:50:b:b3:7f
    10:27:53.255634 arp who-has 10.100.19.85 tell 10.100.19.1
    10:27:53.255634 arp reply 10.100.19.85 is-at 0:0:50:b:b3:7f
    10:27:58.275634 10.100.19.1 > 10.100.19.86: icmp: echo request (DF)
    10:27:58.275634 10.100.19.86 > 10.100.19.1: icmp: echo reply
    10:28:02.375634 arp who-has 10.100.19.1 tell 10.100.19.85
    10:28:02.375634 arp reply 10.100.19.1 is-at 0:7:e9:9:8a:dd
    10:28:08.295634 arp who-has 10.100.19.87 tell 10.100.19.1
    10:28:09.295634 arp who-has 10.100.19.87 tell 10.100.19.1
    10:28:10.295634 arp who-has 10.100.19.87 tell 10.100.19.1
My current hypothesis is that Net::Ping isn't smart enough to leave alone ICMP replies that don't belong to the current process, and that perhaps the other children are grabbing and discarding the replies that should have made it to the children pinging .85 and .86. Any thoughts?

--isotope

Replies are listed 'Best First'.
Re: Does Net::Ping have concurrency issues?
by ehdonhon (Curate) on May 30, 2003 at 20:07 UTC

    Suggestions:

    • Use Parallel::ForkManager
    • splice off a segment of @targets for each child instead of only doing one target per child. It will be more efficient due to the overhead of forking processes.
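The second suggestion, handing each child a segment of @targets rather than a single address, could be sketched as below. The helper name chunk_targets and the batch size of 16 are illustrative assumptions, not something taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: split a target list into batches of at most
# $size addresses, so that one child can work through one batch.
sub chunk_targets {
    my ($size, @targets) = @_;
    die "batch size must be positive\n" unless $size > 0;
    my @batches;
    # splice removes up to $size items from the front on each pass
    push @batches, [ splice(@targets, 0, $size) ] while @targets;
    return @batches;
}

my @targets = map { "10.100.19.$_" } 1 .. 254;
my @batches = chunk_targets(16, @targets);
printf "%d batches, last one has %d targets\n",
    scalar @batches, scalar @{ $batches[-1] };
```

Each array reference in @batches would then be passed to one forked child. The trade-off is that a child pinging a batch sequentially waits out a full timeout for every silent address in it, so smaller batches keep the total scan time closer to a single timeout.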
      We had a similar issue: we needed to run through about 40 thousand IP addresses to make sure folks weren't using IPs not assigned to them. Using Parallel::ForkManager made this much simpler.

      For more detail on the method I used to set up the MySQL connection, check out Re: Re: Re: Secure ways to use DBI?

      The code below pulls the IPs from a database and runs through them 50 at a time. It runs quite well on an Ultra-Sparc 60.

      #!/usr/bin/perl
      use Net::Ping;
      use DBI ();
      use Data_config;
      use Parallel::ForkManager;

      my $MAX_PROCESSES = 50;

      ## Create a database handle ##
      ## The actual user/pass info is in Data_config.pl ##
      my $DSN = "DBI:$DBDRIVER:database=$DATABASE:host=$DBHOST:port=$DBPORT";
      my $DBH = DBI->connect($DSN, $USERNAME, $PASSWORD,
                             { RaiseError => 1, PrintError => 1 });

      $|=1;
      my $PING_TIMEOUT = 2;

      $pm = new Parallel::ForkManager($MAX_PROCESSES);

      foreach my $IP (map { $_->[0] } @{ $DBH->selectall_arrayref( "SELECT ip_address FROM ips" )}) {
          # Forks and returns the pid for the child:
          my $pid = $pm->start and next;

          my $ping = new Net::Ping ("icmp");
          if ($ping->ping($IP, $PING_TIMEOUT)) {
              print "$IP Gotcha! \n";
          }
          else {
              print "$IP \n";
          }
          $ping->close();

          $pm->finish; # Terminates the child process
      }
      $pm->wait_all_children;
Re: Does Net::Ping have concurrency issues?
by isotope (Deacon) on May 30, 2003 at 18:05 UTC
    Ok, duh, moved Net::Ping->new() to within the child block so each child instantiates its own $pinger. Now it works. I guess Net::Ping tracks the responses by PID.

    Update: Ok, brain fart... You're right, Thelonius, it's using one socket per instantiation instead of creating a new one for each ping. I'll blame my allergy attack.

    --isotope
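A minimal sketch of the fix described above, with the pinger constructed inside each child after the fork. The function name ping_in_children and the $make_probe hook are assumptions introduced here so the example can run without root; results come back through exit statuses rather than the original FIFO, and the 25 ms stagger is left out for brevity.

```perl
use strict;
use warnings;
use POSIX ();

# Fork one child per target and build the pinger *inside* the child,
# so no socket is inherited from the parent. $make_probe is a
# hypothetical hook: it returns a code ref that takes a host and
# returns true if the host answered.
sub ping_in_children {
    my ($targets, $make_probe) = @_;
    $make_probe ||= sub {
        require Net::Ping;                        # ICMP mode needs root
        my $pinger = Net::Ping->new('icmp', 5);
        return sub { $pinger->ping($_[0]) };
    };

    my %target_for;                               # pid => target
    for my $target (@$targets) {
        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;
        if ($pid == 0) {
            my $probe = $make_probe->();          # created after the fork
            POSIX::_exit($probe->($target) ? 0 : 1);  # status carries result
        }
        $target_for{$pid} = $target;
    }

    my %alive;
    while ((my $pid = wait()) > 0) {
        next unless exists $target_for{$pid};
        $alive{ $target_for{$pid} } = ($? == 0) ? 1 : 0;
    }
    return \%alive;
}

# Usage with a stand-in probe (no root or network needed):
my $fake  = sub { sub { $_[0] =~ /\.8[56]$/ } };
my $alive = ping_in_children([ map { "10.100.19.$_" } 84 .. 87 ], $fake);
print "$_: ", ($alive->{$_} ? "up" : "down"), "\n" for sort keys %$alive;
```

Calling ping_in_children(\@targets) with no second argument falls back to a real Net::Ping ICMP pinger per child, which requires root privileges.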
      I guess Net::Ping tracks the responses by PID.
      No, not exactly. Net::Ping opens a socket, which is like a file descriptor. The socket is bound to a given port in the chosen protocol. When you fork, the socket (like any file descriptor) is shared by the children. When a ping response comes back, the operating system network driver looks at the port number in the packet to determine where to put the data. All your processes are trying to read the same port, but only one is going to get it.
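The effect of sharing one descriptor across fork() can be reproduced without root privileges. The sketch below is a stand-in, not Net::Ping's actual code path: it uses a Unix datagram socketpair instead of an ICMP socket, and the function name readers_that_got_it is invented for illustration. Several children inherit the same receiving socket, the parent sends a single datagram, and the function counts how many children managed to read it.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_DGRAM PF_UNSPEC);
use IO::Socket;
use IO::Select;
use POSIX ();

# Send ONE datagram to a socket shared by $n forked readers and
# return how many readers actually received it.
sub readers_that_got_it {
    my ($n) = @_;
    my ($rd, $wr) = IO::Socket->socketpair(AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
        or die "socketpair failed: $!\n";
    $rd->blocking(0);        # a losing reader must not hang in sysread

    my @pids;
    for (1 .. $n) {
        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;
        if ($pid == 0) {
            my $got = 0;
            # Wait up to one second for the shared socket to be readable.
            if (IO::Select->new($rd)->can_read(1)) {
                my $bytes = sysread($rd, my $buf, 64);
                $got = 1 if defined $bytes && $bytes > 0;
            }
            POSIX::_exit($got ? 0 : 1);
        }
        push @pids, $pid;
    }

    syswrite($wr, "echo reply") or die "send failed: $!\n";

    my $winners = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $winners++ if $? == 0;
    }
    return $winners;
}

print "readers that saw the reply: ", readers_that_got_it(3), "\n";
```

Only one reader should report success, because a datagram is removed from the shared queue by whichever process reads it first. The other children time out after about a second.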
Re: Does Net::Ping have concurrency issues?
by hardburn (Abbot) on May 30, 2003 at 18:05 UTC

    Are you planning on pinging an entire /24? That's an awful lot of forking. Try perl 5.8.0 threads instead, which (hopefully) are a little more scalable than forking.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

Re: Does Net::Ping have concurrency issues?
by rob_au (Abbot) on May 30, 2003 at 23:04 UTC
    I think you may find the node Time-Slice Concurrent Ping of interest. I posted some code there that can be used to ping multiple hosts concurrently, with no threads, no forking and no external binaries.

     

    perl -le 'print+unpack"N",pack"B32","00000000000000000000001001100011"'

Re: Does Net::Ping have concurrency issues?
by Aristotle (Chancellor) on May 30, 2003 at 20:17 UTC
    Without looking too far into your post, I think you should probably give fping a whirl instead of rolling your own.

    Makeshifts last the longest.

      fping is S-L-O-W... this is the best run I had:
      $ time /usr/local/sbin/fping -r1 -g 10.100.19.1/24 -a 2> /dev/null
      10.100.19.1
      10.100.19.50
      10.100.19.85
      10.100.19.87
      10.100.19.200

      real 0m18.816s
      user 0m0.010s
      sys 0m0.000s
      Maybe I did something wrong, but I could only get it down to one retry, and the default sure isn't zero, despite the --help claim.

      I tried Time-Slice Concurrent Ping with the modifications I posted in that thread and tuning the parameters (allow 254 outstanding pings, 1 second timeout), and this is what I get:
      $ sudo time ./multiping.pl
      Reply time for 10.100.19.1 - 3.007 seconds
      Reply time for 10.100.19.50 - 3.000 seconds
      Reply time for 10.100.19.85 - 2.994 seconds
      Reply time for 10.100.19.87 - 2.993 seconds
      Reply time for 10.100.19.200 - 2.971 seconds
      PONG: 10.100.19.1
      PONG: 10.100.19.50
      PONG: 10.100.19.85
      PONG: 10.100.19.87
      PONG: 10.100.19.200
      0.15user 0.02system 0:04.11elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (394major+191minor)pagefaults 0swaps
      With this one, it's always less than 4.25 seconds, which is much more useful to me.

      --isotope
