spickles has asked for the wisdom of the Perl Monks concerning the following question:
I have written some code to read in a CSV file and then store various characteristics about network information at each building into a hashed array. I then print it back out to another CSV file, and I'd like to be able to sort the information by building name. However, I'm not there yet as my current code only prints 11 of the 40 total entries in the table. My code is cleaned up for now, but I have previously verified by printing out the hash that it indeed contains all 40 records prior to printing to my output file.
#!c:\perl\bin\perl
use strict;
use warnings;
my %buildings;
my $hashref = \%buildings;
my $ref;
my $vlan_number;
my $first_octet;
my $second_octet;
my $third_octet;
my $fourth_octet;
my $subnet_slash;
my $subnet_dotted;
my $network;
my $bldg_name;
my $description;
my $infile = "e:\\IP Addressing Spreadsheet.csv";
open MYFILE,"<$infile" or die "Could not open CSV file! : $!";
foreach my $line (<MYFILE>) {
    chomp($line);
    my @temp = split(/,/,$line);
    $vlan_number = $temp[0];
    $first_octet = $temp[1];
    $second_octet = $temp[2];
    $third_octet = $temp[3];
    $fourth_octet = $temp[4];
    $subnet_slash = $temp[5];
    $subnet_dotted = $temp[6];
    $network = $temp[7];
    $bldg_name = $temp[8];
    $description = $temp[9];
    @temp = ();
    $buildings{$bldg_name}{'name'} = $bldg_name;
    $buildings{$bldg_name}{'number'} = $vlan_number;
    $buildings{$bldg_name}{'first_octet'} = $first_octet;
    $buildings{$bldg_name}{'second_octet'} = $second_octet;
    $buildings{$bldg_name}{'third_octet'} = $third_octet;
    $buildings{$bldg_name}{'fourth_octet'} = $fourth_octet;
    $buildings{$bldg_name}{'subnet_slash'} = $subnet_slash;
    $buildings{$bldg_name}{'subnet_dotted'} = $subnet_dotted;
    $buildings{$bldg_name}{'network'} = $network;
    $buildings{$bldg_name}{'description'} = $description;
}
close MYFILE;
my $outfile = "e:\\BuildingVLANS.csv";
unlink $outfile;
open (OUTFILE,">$outfile") or die "Could not open CSV file! : $!";
print_hash($hashref);
close OUTFILE;
sub print_hash {
    $ref = shift;
    for my $entry (sort keys %buildings) {
        print OUTFILE "Building name,$ref->{$entry}{'name'}\n";
        print OUTFILE "VLAN number,$ref->{$entry}{'number'}\n";
        print OUTFILE "VLAN 1st Octet,$ref->{$entry}{'first_octet'}\n";
        print OUTFILE "VLAN 2nd Octet,$ref->{$entry}{'second_octet'}\n";
        print OUTFILE "VLAN 3rd Octet,$ref->{$entry}{'third_octet'}\n";
        print OUTFILE "VLAN 4th Octet,$ref->{$entry}{'fourth_octet'}\n";
        print OUTFILE "VLAN Subnet Slash,$ref->{$entry}{'subnet_slash'}\n";
        print OUTFILE "VLAN Subnet Dotted,$ref->{$entry}{'subnet_dotted'}\n";
        print OUTFILE "VLAN Network,$ref->{$entry}{'network'}\n";
        print OUTFILE "VLAN Description,$ref->{$entry}{'description'}\n";
        print OUTFILE "\n";
    }
}
__END__
Re: Printing of Array Hash is Missing Elements
by ELISHEVA (Prior) on Sep 28, 2009 at 20:11 UTC
You say you have verified that the hash contains all 40 elements. How? Without seeing that code it is going to be hard for us to guess why print_hash() only prints out 11 elements, but your mystery method prints out 40.
That aside, you might want to consider the following changes:
- slurping: foreach my $line (<MYFILE>) reads the whole file into a list before the loop starts, just as @lines = <MYFILE> would. This doesn't scale well because you have to hold the entire file in memory. A better approach is to read one line at a time:
    while (my $line = <MYFILE>) {
        chomp $line;
        # set up hash entry
    }
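- two parameter open: there are some rare circumstances where this can lead to problems, for example when the file name itself begins with a character such as < or > or |, which is then read as part of the mode. Consider getting in the habit of using the 3 parameter open: open(MYFILE, '<', $infile)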
- clearing @temp by setting @temp=() is unnecessary. split is going to replace the contents of @temp anyway.
- You don't actually need @temp. Perl has a wonderful feature that lets you assign array elements to a list of named variables: my ($vlan_number, $first_octet, $second_octet, $third_octet, $fourth_octet, $subnet_slash, $subnet_dotted, $network, $bldg_name, $description) = split(/,/, $line);
- Unless you are absolutely sure that there will never be whitespace on either side of your comma, split(/\s*,\s*/, $line) is a more robust way of splitting a comma delimited list.
- Assigning $hashref=\%buildings at the top, filling %buildings, and then printing $hashref seems a bit convoluted. You can assign data directly to a hash reference by using the -> operator, so why not just do this:
    my $buildings = {}; # set up a hash reference
    while (my $line = <MYFILE>) {
        #... parse line
        $buildings->{$bldg_name}{'name'} = $bldg_name;
        $buildings->{$bldg_name}{'number'} = $vlan_number;
        #... and so on
    }
    # ... more stuff
    print_hash($buildings);
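Pulling the points in this list together, a minimal sketch might look like the following. The sub name parse_buildings and the two sample lines are made up for illustration, and the data is read through an in-memory filehandle only so that the example runs on its own; a real script would open the CSV file instead. It still keys the hash by building name, as the original post does.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): reads comma-separated
# lines from an open filehandle and returns a hash reference keyed by
# building name.
sub parse_buildings {
    my ($fh) = @_;
    my $buildings = {};
    while (my $line = <$fh>) {        # one line at a time, no slurping
        chomp $line;
        next unless length $line;     # skip empty lines
        # List assignment replaces @temp; \s* tolerates spaces around commas.
        my ($vlan_number, $first_octet, $second_octet, $third_octet,
            $fourth_octet, $subnet_slash, $subnet_dotted, $network,
            $bldg_name, $description) = split /\s*,\s*/, $line;
        $buildings->{$bldg_name} = {
            name          => $bldg_name,
            number        => $vlan_number,
            first_octet   => $first_octet,
            second_octet  => $second_octet,
            third_octet   => $third_octet,
            fourth_octet  => $fourth_octet,
            subnet_slash  => $subnet_slash,
            subnet_dotted => $subnet_dotted,
            network       => $network,
            description   => $description,
        };
    }
    return $buildings;
}

# Made-up sample input; the second line has spaces after the commas.
my $sample = "321,10,32,32,0,19,255.255.224.0,10.32.32.0 /19,Building1,Data VLAN\n"
           . "331, 10, 32, 225, 0, 24, 255.255.255.0, 10.32.225.0 /24, Building2, Data VLAN\n";

# Three-parameter open with a lexical filehandle (here on a string).
open my $in, '<', \$sample or die "Could not open sample data: $!";
my $buildings = parse_buildings($in);
close $in;

print "$_ => VLAN $buildings->{$_}{number}\n" for sort keys %$buildings;
```

Because every variable is declared inside the loop, nothing from one line can leak into the next.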
Best, beth
Et al -
Thanks for the many suggestions. It's my limited knowledge of perl that results in my current code. I appreciate the overwhelming support from those that do this on a more regular basis. I think that the suggestions I have gotten here will allow me to make my code smaller, more reliable, and easier to understand. It has been particularly the use of multidimensional arrays, hashes and references that has gotten me confused here. I'll clean things up and post back. Thanks again!!
Re: Printing of Array Hash is Missing Elements
by GrandFather (Saint) on Sep 28, 2009 at 19:56 UTC
How about a little data? Not all the data, just enough to demonstrate the problem. You should also show us a sample of what you get and what you expect for the sample data.
At present I can't tell if there should be data for 40 buildings, but you are only seeing output data for 11, or if the problem is that there are 40 columns in the input and only 11 in the output. Although you describe your output file as CSV, it's not using the same format as the original file and doesn't conform to the usual expectations for a CSV file - it may be better thought of as a text or data file to avoid confusion.
As an aside, don't globally declare local variables - all your temporary variables should be declared where they are initialized (inside the loop or sub). Use the three parameter version of open and use a lexical variable (my $myFile) instead of MYFILE (good to see the open is checked though).
Also, use a while loop rather than a for loop for reading input lines.
True laziness is hard work
Grandfather -
Ok, so I modified some of the data file and applied the suggestions already posted. It is clear now, after modifying the building names, that the output contains only one record per unique building name. The input is this:
VLAN Tag 1st Octet 2nd Octet 3rd Octet 4th Octet Mask Mask Subnet Building Description
321 10 32 32 0 19 255.255.224.0 10.32.32.0 /19 Building1 Data VLAN
322 10 32 0 24 255.255.255.0 10.32..0 /24 Building1 Data VLAN
323 10 32 128 0 19 255.255.224.0 10.32.128.0 /19 Building1 Data VLAN
324 10 32 96 0 19 255.255.224.0 10.32.96.0 /19 Building1 Data VLAN
325 10 32 160 0 19 255.255.224.0 10.32.160.0 /19 Building1 Data VLAN
326 10 32 192 0 19 255.255.224.0 10.32.192.0 /19 Building1 Data VLAN
327 10 32 64 0 19 255.255.224.0 10.32.64.0 /19 Building1 Data VLAN
328 10 32 248 0 22 255.255.252.0 10.32.248.0 /22 Building1 Data VLAN
329 10 32 0 24 255.255.255.0 10.32..0 /24 Building1 Data VLAN
330 10 32 224 0 24 255.255.255.0 10.32.224.0 /24 Building1 Data VLAN
331 10 32 225 0 24 255.255.255.0 10.32.225.0 /24 Building2 Data VLAN
332 10 32 228 0 24 255.255.255.0 10.32.228.0 /24 Building2 Data VLAN
333 10 32 229 0 24 255.255.255.0 10.32.229.0 /24 Building2 Data VLAN
334 10 32 232 0 24 255.255.255.0 10.32.232.0 /24 Building2 Data VLAN
335 10 32 233 0 24 255.255.255.0 10.32.233.0 /24 Building2 Data VLAN
336 10 32 236 0 24 255.255.255.0 10.32.236.0 /24 Building2 Data VLAN
337 10 32 237 0 24 255.255.255.0 10.32.237.0 /24 Building2 Data VLAN
338 10 32 240 0 24 255.255.255.0 10.32.240.0 /24 Building2 Data VLAN
339 10 32 241 0 24 255.255.255.0 10.32.241.0 /24 Building2 Data VLAN
340 10 32 244 0 24 255.255.255.0 10.32.244.0 /24 Building2 Data VLAN
341 10 32 245 0 24 255.255.255.0 10.32.245.0 /24 Building3 Data VLAN
342 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
343 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
344 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
345 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
346 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
347 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
348 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
349 10 32 0 24 255.255.255.0 10.32..0 /24 Building3 Data VLAN
350 10 32 2 0 23 255.255.254.0 10.32.2.0 /23 Building3 Data VLAN
351 10 32 4 0 23 255.255.254.0 10.32.4.0 /23 Building4 Data VLAN
352 10 32 6 0 23 255.255.254.0 10.32.6.0 /23 Building4 Data VLAN
353 10 32 8 0 23 255.255.254.0 10.32.8.0 /23 Building4 Data VLAN
354 10 32 10 0 23 255.255.254.0 10.32.10.0 /23 Building4 Data VLAN
355 10 32 12 0 23 255.255.254.0 10.32.12.0 /23 Building4 Data VLAN
356 10 32 14 0 23 255.255.254.0 10.32.14.0 /23 Building4 Data VLAN
365 10 33 32 0 24 255.255.255.0 10.33.32.0 /24 Building4 Data VLAN
606 10 0 249 0 29 255.255.255.248 10.0.249.0 /29 Building4 Data VLAN
The output is:
Building name Building
VLAN number VLAN Tag
VLAN 1st Octet 1st Octet
VLAN 2nd Octet 2nd Octet
VLAN 3rd Octet 3rd Octet
VLAN 4th Octet 4th Octet
VLAN Subnet Slash Mask
VLAN Subnet Dotted Mask
VLAN Network Subnet
VLAN Description Description
Building name Building1
VLAN number 330
VLAN 1st Octet 10
VLAN 2nd Octet 32
VLAN 3rd Octet 224
VLAN 4th Octet 0
VLAN Subnet Slash 24
VLAN Subnet Dotted 255.255.255.0
VLAN Network 10.32.224.0 /24
VLAN Description Data VLAN
Building name Building2
VLAN number 340
VLAN 1st Octet 10
VLAN 2nd Octet 32
VLAN 3rd Octet 244
VLAN 4th Octet 0
VLAN Subnet Slash 24
VLAN Subnet Dotted 255.255.255.0
VLAN Network 10.32.244.0 /24
VLAN Description Data VLAN
Building name Building3
VLAN number 350
VLAN 1st Octet 10
VLAN 2nd Octet 32
VLAN 3rd Octet 2
VLAN 4th Octet 0
VLAN Subnet Slash 23
VLAN Subnet Dotted 255.255.254.0
VLAN Network 10.32.2.0 /23
VLAN Description Data VLAN
Building name Building4
VLAN number 606
VLAN 1st Octet 10
VLAN 2nd Octet 0
VLAN 3rd Octet 249
VLAN 4th Octet 0
VLAN Subnet Slash 29
VLAN Subnet Dotted 255.255.255.248
VLAN Network 10.0.249.0 /29
VLAN Description Data VLAN
Because you are keying your hash by building name, you only get 4 hash entries for the data rows, one for each unique building name (plus one for the header line, which ends up keyed as "Building"). Each time you add a new line with the same building name, it overwrites the last line you inserted for that building entry. Hence you read in 40 lines but only get out 4.
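The overwriting is easy to reproduce in isolation. Here is a small sketch with made-up rows (three rows, two distinct building names); the variable names are only for illustration:

```perl
use strict;
use warnings;

# Made-up (VLAN number, building name) pairs: three rows, two buildings.
my @rows = ([321, 'Building1'], [322, 'Building1'], [331, 'Building2']);

my %by_name;
for my $row (@rows) {
    my ($vlan_number, $bldg_name) = @$row;
    # Same key as an earlier row: the earlier value is replaced, not kept.
    $by_name{$bldg_name}{number} = $vlan_number;
}

printf "%d rows read, %d hash entries kept\n",
    scalar @rows, scalar keys %by_name;
print "Building1 now holds VLAN $by_name{Building1}{number}\n";
```

This prints "3 rows read, 2 hash entries kept", and Building1 ends up holding VLAN 322, the last row seen for that key.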
If you want to keep each row separate, I would recommend storing your parsed records in an array of hashes and using a customized sort function.
Instead of $buildings{$bldg_name}{first_octet} do something like this:
    my @aBuildings;
    while (my $line = <MYFILE>) {
        chomp $line;
        #... parse line
        # add a new element to your array of hashes (AoH)
        my $hLine = {
            name   => $bldg_name,
            number => $vlan_number,
            # ... and so on for the remaining fields
        };
        push @aBuildings, $hLine;
    }
Then in your print out you would use a custom sort to sort the array elements by building name:
    foreach my $hLine (sort { $a->{name} cmp $b->{name} } @aBuildings) {
        # print out contents of $hLine
    }
See sort and perldsc for more information about custom sorting routines and AoH (array of hash) data structures.
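If several rows share a building name, you may also want a second sort key. A minimal sketch, assuming made-up records and that VLAN numbers are always numeric:

```perl
use strict;
use warnings;

# Made-up records, deliberately out of order.
my @aBuildings = (
    { name => 'Building2', number => 331 },
    { name => 'Building1', number => 322 },
    { name => 'Building1', number => 321 },
);

# Compare names as strings first; fall back to the numeric VLAN on a tie.
my @sorted = sort {
    $a->{name} cmp $b->{name} or $a->{number} <=> $b->{number}
} @aBuildings;

print "$_->{name},$_->{number}\n" for @sorted;
```

The "or" only reaches the numeric comparison when cmp returns 0, that is, when the two names are equal.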
Best, beth
Your code splits on commas but your sample data has no commas.
To produce all the networks in the format you have shown, sorted by building and VLAN, I might do something like the following:
use strict;
use warnings;

# Print one block per input row, sorted by building name and then VLAN number.
do {
    print <<EOF;
Building name $_->[8]
VLAN number $_->[0]
VLAN 1st Octet $_->[1]
VLAN 2nd Octet $_->[2]
VLAN 3rd Octet $_->[3]
VLAN 4th Octet $_->[4]
VLAN Subnet Slash $_->[5]
VLAN Subnet Dotted $_->[6]
VLAN Network $_->[7]
VLAN Description $_->[9]
EOF
} for sort { $a->[8] cmp $b->[8] or $a->[0] <=> $b->[0] }
      # fixed-width fields: six of 3 chars, two of 15, one of 11, then the rest
      map { [ unpack('(A3 x1)6 (A15 x1)2 A11 x1 A*', $_) ] } <DATA>;
__DATA__
321 10  32  32  0   19  255.255.224.0   10.32.32.0 /19  Building1   Data VLAN
322 10  32      0   24  255.255.255.0   10.32..0 /24    Building1   Data VLAN
323 10  32  128 0   19  255.255.224.0   10.32.128.0 /19 Building1   Data VLAN
324 10  32  96  0   19  255.255.224.0   10.32.96.0 /19  Building1   Data VLAN
325 10  32  160 0   19  255.255.224.0   10.32.160.0 /19 Building1   Data VLAN
326 10  32  192 0   19  255.255.224.0   10.32.192.0 /19 Building1   Data VLAN
327 10  32  64  0   19  255.255.224.0   10.32.64.0 /19  Building1   Data VLAN
328 10  32  248 0   22  255.255.252.0   10.32.248.0 /22 Building1   Data VLAN
329 10  32      0   24  255.255.255.0   10.32..0 /24    Building1   Data VLAN
330 10  32  224 0   24  255.255.255.0   10.32.224.0 /24 Building1   Data VLAN
331 10  32  225 0   24  255.255.255.0   10.32.225.0 /24 Building2   Data VLAN
332 10  32  228 0   24  255.255.255.0   10.32.228.0 /24 Building2   Data VLAN
333 10  32  229 0   24  255.255.255.0   10.32.229.0 /24 Building2   Data VLAN
334 10  32  232 0   24  255.255.255.0   10.32.232.0 /24 Building2   Data VLAN
335 10  32  233 0   24  255.255.255.0   10.32.233.0 /24 Building2   Data VLAN
336 10  32  236 0   24  255.255.255.0   10.32.236.0 /24 Building2   Data VLAN
337 10  32  237 0   24  255.255.255.0   10.32.237.0 /24 Building2   Data VLAN
338 10  32  240 0   24  255.255.255.0   10.32.240.0 /24 Building2   Data VLAN
339 10  32  241 0   24  255.255.255.0   10.32.241.0 /24 Building2   Data VLAN
340 10  32  244 0   24  255.255.255.0   10.32.244.0 /24 Building2   Data VLAN
341 10  32  245 0   24  255.255.255.0   10.32.245.0 /24 Building3   Data VLAN
342 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
343 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
344 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
345 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
346 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
347 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
348 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
349 10  32      0   24  255.255.255.0   10.32..0 /24    Building3   Data VLAN
350 10  32  2   0   23  255.255.254.0   10.32.2.0 /23   Building3   Data VLAN
351 10  32  4   0   23  255.255.254.0   10.32.4.0 /23   Building4   Data VLAN
352 10  32  6   0   23  255.255.254.0   10.32.6.0 /23   Building4   Data VLAN
353 10  32  8   0   23  255.255.254.0   10.32.8.0 /23   Building4   Data VLAN
354 10  32  10  0   23  255.255.254.0   10.32.10.0 /23  Building4   Data VLAN
355 10  32  12  0   23  255.255.254.0   10.32.12.0 /23  Building4   Data VLAN
356 10  32  14  0   23  255.255.254.0   10.32.14.0 /23  Building4   Data VLAN
365 10  33  32  0   24  255.255.255.0   10.33.32.0 /24  Building4   Data VLAN
606 10  0   249 0   29  255.255.255.248 10.0.249.0 /29  Building4   Data VLAN
But the sorting appears to be unnecessary as the data you have provided is already sorted.
Re: Printing of Array Hash is Missing Elements
by toolic (Bishop) on Sep 28, 2009 at 19:44 UTC
for my $entry (sort keys %buildings) {
Try:
for my $entry (sort keys %{ $ref }) {
I would also do:
print_hash(\%buildings);
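Taken a step further, the sub can receive everything it needs as arguments so it no longer depends on any global. The following is only a sketch: passing the output handle as a parameter is my own assumption, the field list is shortened to two entries, and the data is made up. It writes into a string through an in-memory filehandle so it can be run as is.

```perl
use strict;
use warnings;

# Hypothetical variant of print_hash: both the output handle and the hash
# reference are passed in, so the sub touches no global variables.
sub print_hash {
    my ($out, $ref) = @_;
    for my $entry (sort keys %{ $ref }) {
        print {$out} "Building name,$ref->{$entry}{name}\n";
        print {$out} "VLAN number,$ref->{$entry}{number}\n";
        print {$out} "\n";
    }
}

# Made-up data for illustration.
my %buildings = (
    Building2 => { name => 'Building2', number => 340 },
    Building1 => { name => 'Building1', number => 330 },
);

# In-memory filehandle keeps the example self-contained.
my $text = '';
open my $out, '>', \$text or die "Could not open in-memory handle: $!";
print_hash($out, \%buildings);
close $out;
print $text;
```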
Re: Printing of Array Hash is Missing Elements
by jakobi (Pilgrim) on Sep 28, 2009 at 20:18 UTC
If you applied the earlier tips and the bug is still not obvious, consider warn statements and the Perl debugger:
Besides the time-honored practice of liberally sprinkling warn() calls to check "assertions" such as my @keys = sort keys %buildings; @keys == 40 or warn "NOT forty: " . scalar(@keys) . "\n"; you shouldn't forget the possibilities offered by the Perl debugger, especially when you set a breakpoint near the suspect code.
Updated to include Dumper: also consider use Data::Dumper; ...; warn Dumper(...); I actually prefer this over interactive debugger use.
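As a rough sketch of both ideas together (the hash contents are made up, and the expected count of 40 is simply taken from the original question):

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order makes dumps easier to compare

# Made-up data standing in for the real %buildings.
my %buildings = (
    Building1 => { number => 330 },
    Building2 => { number => 340 },
);
my $expected = 40;   # assumed record count, from the original question

# keys in scalar context returns the number of entries.
my $count = keys %buildings;
warn "Expected $expected entries but found $count\n" if $count != $expected;

# Dump the whole structure to STDERR for inspection.
warn Dumper(\%buildings);
```

Both messages go to STDERR, so they do not end up in the CSV output.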