Currently you're slurping the entire data set into memory at once; not only that, you're copying possibly huge chunks of it several more times. If you can build your code around a while() loop and process one line at a time instead of slurping the entire file, you'd be much better off, memory-wise.
# instead of this:
my @lines = <DATA>;
# do something like this:
while (my $line = <DATA>) {
...
}
Even if you only build @matches in that loop and keep the rest of the code the same, you may be much better off (assuming you have few matches compared to the size of the dataset). Deleting arrays after you're done with them (use my and arrange the code so they go out of lexical scope) will also help with memory reuse.
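As a minimal sketch of that idea (not the original poster's code): the hypothetical helper collect_matches below reads a filehandle one line at a time and keeps only the captured part of matching lines. The sample data and the regex are made up for illustration, and an in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: collect only the captured text of matching lines,
# reading one line at a time so the whole file is never held in memory.
sub collect_matches {
    my ($fh, $regex) = @_;
    my @matches;
    while (my $line = <$fh>) {
        push @matches, $1 if $line =~ $regex;
    }
    return \@matches;    # the per-line buffer is freed when the sub returns
}

# Made-up sample input standing in for the real ARP dump.
my $sample = "192.168.87.10 aa:bb\n10.0.0.1 cc:dd\n192.168.87.22 ee:ff\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";
my $matches = collect_matches($fh, qr/^(192\.168\.87\.\d+)/);
close $fh;

print "$_\n" for @$matches;
```

Only the matches survive the loop, so peak memory tracks the number of matches rather than the size of the file.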
If you can more clearly explain what this code is supposed to do, we might be able to find a much more straightforward solution. As it is, the code seems to be doing the same thing over again several times in different ways before printing its final results.
Alan
Just to further explain what I am trying to achieve: we currently manage a couple hundred subnets at my job. Each Monday morning, we get ARP cache dumps from all of our routers sent to us. People send us requests for IP addresses and DNS names. There are network admins who are notorious for not returning IP addresses.
What I am trying to do is take the last few months' worth of ARP information (that is the tail -250000... command; it's just an approximation). We use that to determine which IPs have had no activity for a while, then we remove the allocation from our records and notify the admin it was assigned to that we have reclaimed the address.
Thanks for the input!
perl -e 'print reverse qw/o b n a e s/;'
If you're concerned about memory usage, you shouldn't read the entire file at once. Read it line by line and count the IPs in a hash, so you don't get duplicated entries. And, of course, use strict, but I guess you knew that already :) Anyhow, here's a snippet that came to my mind:
#!/usr/bin/perl -w
use strict;
my $subnet = '192.168.87';
my %data = ();
my $debug = 0; # set to 1 to report non-matching lines on STDERR
# precompile regex for performance; \Q...\E keeps the dots in $subnet literal
my $regex = qr#^(\Q$subnet\E\.\d+)#;
# read file line by line
while (my $line = <DATA>)
{
    chomp($line);
    if ($line =~ $regex)
    {
        $data{$1}++;
    }
    elsif ($debug)
    {
        print STDERR "Didn't match: $line\n";
    }
}
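Once the loop has finished, %data holds one entry per address with the number of times it was seen. A rough sketch of reporting from it follows; the counts here are made up, and sorting on the last octet assumes every key is in the same /24.

```perl
use strict;
use warnings;

# Made-up counts, standing in for the %data hash built by the loop above.
my %data = (
    '192.168.87.10' => 3,
    '192.168.87.2'  => 1,
    '192.168.87.30' => 7,
);

# Sort on the last octet numerically so that .2 comes before .10.
my @report = map  { "$_ seen $data{$_} time(s)" }
             sort { (split /\./, $a)[3] <=> (split /\./, $b)[3] }
             keys %data;

print "$_\n" for @report;
```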
If you really do want to stick to your own code, your line is better written as (it's not very nice either..):
my @matches = grep { m/$regex/ } @lines;
($_) = m/$regex/ foreach @matches;
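For what it's worth, the two statements can probably be folded into one. In list context a match with capture groups returns the captures, and a failed match returns the empty list, so a single map both filters and extracts. A small sketch with made-up sample lines:

```perl
use strict;
use warnings;

my $regex = qr/^(192\.168\.87\.\d+)/;

# Made-up sample lines standing in for @lines.
my @lines = ("192.168.87.5 aa\n", "10.1.1.1 bb\n", "192.168.87.9 cc\n");

# Successful matches contribute their capture; failed matches contribute nothing.
my @matches = map { /$regex/ } @lines;

print "$_\n" for @matches;
```

This still needs @lines in memory, so it only tidies the expression; it does nothing for the memory problem.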
Regards,
-octo-
I'll give this a try. I was trying to be elegant and do things faster than brute forcing my way line by line, but I guess you see where that got me... :-(
Yeah, I know I really should have been using strict, and I felt like an idiot posting the code without it (thus the warning up top). Thanks for your input!
When I clean up the code (and am using strict like I should be), I will repost it.
perl -e 'print reverse qw/o b n a e s/;'
OK, here is the updated code. I modified it to read from a file instead of <DATA>. And guess what?!?! It runs under strict!! (Oh, it works too)
++ to ferrency and flocto for their help!!
#!/usr/bin/perl -w
use strict;
use vars qw/ $opt_s /;
use Getopt::Std;
use Net::Netmask;
my $subnet;
my %data = ();
my $ARP = '/tmp/arp.bak';
getopt('s');
if (defined($opt_s) && $opt_s ne '') {
$subnet = $opt_s;
} else {
help();
}
my $block = new Net::Netmask($subnet);
if (defined($block->{'ERROR'})) { die "Invalid subnet/mask combination."}
my @range = $block->enumerate();
#We need to get our temp file
system("tail -250000 /tmp/arp > $ARP") == 0 or die "tail failed: $?";
#Open the file and read it line by line
open(FH, $ARP) || die ("Couldn't open the arp file $ARP");
while (<FH>)
{
if (/^((?:\d{1,3}\.){3}\d{1,3})/)
{
if ($block->match($1)) { $data{$1}++; }
}
}
close FH;
#I need to specifically remove the network, gateway and broadcast
#addresses since we don't care about those.
#First remove from enumeration array
shift @range; #Network Address
shift @range; #Gateway Address
pop @range; #Broadcast Address
#Now remove from the IP's found in the ARP cache
delete $data{$block->base()};
delete $data{$block->nth(1)};
delete $data{$block->broadcast()};
my @matches = keys %data;
#Compare the array of matched IPs to the enumerated Netblock
my @intersection = my @difference = ();
undef %data;
foreach my $element (@matches, @range) { $data{$element}++ }
foreach my $element (keys %data) {
push @{ $data{$element} > 1 ? \@intersection : \@difference }, $element;
}
#Now I'd like to sort the IPs (a little Schwartzian Transform action here...)
my @sorted = map { join '.', unpack 'N*', $_ }
sort
map { pack 'N*', split /\./ }
@difference;
print "Addresses that are candidates for reclaim in:\n";
print $block->desc(), "\n\n";
print join("\n",@sorted), "\n";
sub help {
print <<'HELP';
You must supply a valid subnet.
Acceptable formats are as follows:
192.168.1.0/24 <--- The preferred form.
192.168.1.0:255.255.255.0
192.168.1.0-255.255.255.0
syntax: arpscan.pl -s 192.168.1.0/24
HELP
exit(1);
}
Update: Modified the IP sort to use a Schwartzian Transform so I can have them truly sorted like IP's should be.
Update: Added a little help, support for VLSMs, and stripped out unneeded addresses (network, gateway, and broadcast). Note: the gateway is specific to our organization; yours may use a different address. We use network + 1. Thanks to tye, belg4mit, and arturo for your help with my regex issue. /msg me if I left you out.
Update: Added code to check for invalid IP address/mask combination since some joker here already tried to enter something like 192.168.1.256/24.
Update: Fixed the regex (read as: removed a space that I would have never seen in a million years!) ++tye
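For anyone curious about the IP sort, here is a standalone sketch of the packed-key idea with made-up addresses. It uses pack 'C4' (one byte per octet) rather than the 'N*' in the script above; that is a substitution for illustration, and both should give the same ordering for valid dotted quads.

```perl
use strict;
use warnings;

# Made-up addresses; a plain string sort would put .100 before .20 and .9.
my @ips = qw(192.168.1.100 192.168.1.9 192.168.1.20 10.0.0.5);

# Pack each address into a 4-byte string, one byte per octet.
my @keys = map { pack 'C4', split /\./ } @ips;

# A plain string sort on the packed keys is a numeric sort on the octets;
# unpacking turns each key back into dotted-quad form.
my @sorted = map { join '.', unpack 'C4', $_ } sort @keys;

print "$_\n" for @sorted;
```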
perl -e 'print reverse qw/o b n a e s/;'