http://qs321.pair.com?node_id=11117581

varneraa has asked for the wisdom of the Perl Monks concerning the following question:

This new DB project just isn't being nice to me. Part of a new DB that I am putting together has a normalization table for services that we have organized into pools of machines.

I was attempting to uniquify an array of strings and I noticed that my input array of 30k elements became an output of one undef entry. I figured uniq from List::MoreUtils must just have a bug and I would have to move on with something of my own, even though it works on 10 other tables I'm uniquifying.

The server names are standard FQDNs: alphanumeric with period separators.
    my @output;
    my $pk = "SomeString";
    foreach my $pool ( @pools ) {
        my @machines = qx/Some command that gets me a list of machines/;
        # Some code here to check for cmd errors, etc. Convert output to an array of hashes (decode_json is involved);
        my $machines_aref = decode_json $machines[0];
        foreach my $row ( @{$machines_aref} ) {
            push @output, $row->{$pk};
        }
    }
    # At this point I can see the full list of machines, with no issues, or empty lines
    @output = uniq @output;
    # At this point @output is empty
    return \@output;
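
For reference, here is a minimal sketch of how uniq treats a list like this one. The host names are made up. It uses uniq from the core List::Util module (exported since List::Util 1.45) instead of List::MoreUtils so that it runs without extra installs; treating the two as interchangeable here is an assumption, not something the post confirms. The last part shows that a list made up only of undef values deduplicates to a single undef. That would be consistent with the symptom described above, but the post does not establish that this is what happened.

```perl
use strict;
use warnings;
use List::Util qw(uniq);    # core module; uniq is exported since List::Util 1.45

# Hypothetical host names standing in for the real command output.
my @output = (
    'web01.east.example.com',
    'web02.east.example.com',
    'web01.east.example.com',
);

# Count undefined entries before deduplicating, so a bad extraction
# step is noticed before uniq hides it.
my $undef_count = grep { !defined($_) } @output;

# List context: uniq returns the distinct values in first-seen order.
my @unique = uniq @output;

printf "%d in, %d unique, %d undef\n",
    scalar @output, scalar @unique, $undef_count;

# With List::Util's uniq, repeated undef values collapse into one undef.
my @collapsed = uniq( undef, undef, undef );
printf "all-undef input collapses to %d element(s)\n", scalar @collapsed;
```

Checking the undef count before calling uniq separates "uniq is broken" from "the values pushed into the array were never defined".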

So I decided to just uniquify the list myself; it's only 30k entries, right? I put together the following. I know it's inefficient, but I figured I would sort that out later.

Same loop to generate the arrays. I am still seeing all the values in the array before this:

    my @final_output;
    foreach my $line ( @output ) {
        next if ( grep m/^$line$/, @final_output );
        push @final_output, $line;
    }

This time I don't see 0 results; the code just hangs. At first I thought, wow, is it really that inefficient? So I decide I can do better.

From a little reading about uniq I find out it essentially just slams the array values into a hash and then pulls the keys. I don't care about ordering, so why not just roll that myself?

    my %output_hash;
    my $pk = "SomeString";
    foreach my $pool ( @pools ) {
        my @machines = qx/Some command that gets me a list of machines/;
        # Some code here to check for cmd errors, etc. Convert output to an array of hashes (decode_json is involved);
        my $machines_aref = decode_json $machines[0];
        foreach my $row ( @{$machines_aref} ) {
            $output_hash{$row->{$pk}} = 1;
        }
    }
    my @output_array = keys %output_hash;
    return \@output_array;
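
As a point of comparison for the two hand-rolled attempts above, here is a minimal sketch of the common %seen idiom. The host names are hypothetical. It makes a single pass over the list, so it avoids rescanning @final_output for every element, which is what makes the grep version quadratic. It also compares keys as exact strings. In the grep version, the periods in $line are regex metacharacters unless the pattern is written as m/^\Q$line\E$/. Skipping undefined values is an added assumption about what is wanted here, not something taken from the post.

```perl
use strict;
use warnings;

# Hypothetical input standing in for the collected machine names.
my @output = (
    'web01.east.example.com',
    'web02.east.example.com',
    undef,
    'web01.east.example.com',
);

# Single pass: keep a value only the first time it is seen.
# defined($_) guards against undef being used as a hash key.
my %seen;
my @final_output = grep { defined($_) && !$seen{$_}++ } @output;

print "$_\n" for @final_output;
```

Unlike taking keys of a hash, this keeps the values in first-seen order.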

Guess what? It fails. "Use of uninitialized value in concatenation (.) or string at...<The line that assigns the $output_hash{$row->{$pk}} = 1;>"

Ok, great, there is an undefined value in a row from $machine_aref, right? Yes, but no, but yes?

A little more debug code later and I have this...

    my %output_hash;
    my $pk = "SomeString";
    foreach my $pool ( @pools ) {
        my @machines = qx/Some command that gets me a list of machines/;
        # Some code here to check for cmd errors, etc. Convert output to an array of hashes (decode_json is involved);
        my $machines_aref = decode_json $machines[0];
        print "$#{$machines_aref} lines returned\n";
        foreach my $row ( @{$machines_aref} ) {
            unless ( defined $row->{$pk} ) {
                print "'$pk'\n";
                print Dumper $row;
                print "$pk, $row\n";
                print "Bad return: Some command from above\n";
                print "HERE: " . $row->{"$pk"} . "\n";
                die;
            }
            $output_hash{$row->{$pk}} = 1;
        }
    }
    my @output_array = keys %output_hash;
    return \@output_array;

Surely we're going to hit the unless block and everything will be undef, right? Yes and no and yes again...

    <PoolName>: 4737 lines returned.
    'SomeString'
    $VAR1 = {
              'SomeString' => '<machinename>.<location>.<something>.com',
              'Count' => 6
            };
    SomeString, HASH(0x106d978)
    Bad command return: <Some command used to get this data>
    Use of uninitialized value in concatenation (.) or string at <script_path> line <Line with print "HERE: " .$row->{"$pk"} . "\n";>.

So I see that $row->{$pk} is undefined for this row, but I can dump the row and see that the key matches $pk exactly (remember, this works for every other entry in the array), and then it fails again when I try to print the value. WTF?
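
The post does not establish the cause, so the following is only a diagnostic sketch for one possibility that could be ruled out: a key that looks identical in a default Data::Dumper dump but contains a non-printable character. The row below is hypothetical, and the trailing "\0" is planted on purpose to show the effect. Setting $Data::Dumper::Useqq makes Dumper escape such characters, and comparing each key to $pk with eq shows whether any key matches exactly.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;    # escape non-printable characters in the dump
$Data::Dumper::Sortkeys = 1;    # stable key order between runs

my $pk = "SomeString";

# Hypothetical row: the first key carries an invisible trailing NUL byte.
my $row = { "SomeString\0" => 'host.example.com', Count => 6 };

print Dumper($row);

# Exact matches are what $row->{$pk} can find; near matches merely
# contain the expected key text somewhere inside them.
my @exact = grep { $_ eq $pk } keys %$row;
my @near  = grep { $_ ne $pk && /\Q$pk\E/ } keys %$row;

printf "exact matches: %d, near matches: %d\n", scalar @exact, scalar @near;
printf "key length %d vs expected %d\n", length($_), length($pk) for @near;
```

If every key turns out to match exactly, this possibility is eliminated and the problem lies elsewhere.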

After this, I decided it was worth seeing which value(s) caused the problem. Turns out it was the first one, and the second, and wait, I guess all of them... So I switched the output of the command I was using to one that would just give me comma-delimited values, and now everything works, including uniq. I still wish I understood what went wrong. If anyone is seeing a similar issue, I'm sorry this is a dead end. If someone with more expertise than me understands what is going on, I'd love your feedback. As always, thanks!
Some final details: Perl 5.30.2 built using perlbrew on SLES11 - Linux 3.0.101-108.87-default