PerlMonks  

Dangers of decode_json? or a hash entry is both defined and undefined. WTF?

by varneraa (Acolyte)
on Jun 02, 2020 at 04:07 UTC ( [id://11117581]=perlquestion )

varneraa has asked for the wisdom of the Perl Monks concerning the following question:

This new DB project just isn't being nice to me. Part of a new DB that I am putting together has a normalization table for services that we have organized into pools of machines.

I was attempting to uniquify an array of strings and I noticed that my input array of 30k elements became an output of one undef entry. I figured uniq from List::MoreUtils must just have a bug and I would have to roll something myself, even though it works on 10 other tables I'm uniquifying.

The server names are standard FQDNs: alphanumeric with period separators.
    my @output;
    my $pk = "SomeString";
    foreach my $pool ( @pools ) {
        my @machines = qx/Some command that gets me a list of machines/;
        #Some code here to check for cmd errors, etc. Convert output to an array of hashes (decode_json is involved);
        my $machine_aref = decode_json $machines[0];
        foreach my $row ( @{$machines_aref} ) {
            push @output, $row->{$pk};
        }
    }
    #At this point I can see the full list of machines, with no issues, or empty lines
    @output = uniq @output;
    #At this point @output is empty
    return \@output;

So I decided to just uniquify the list myself. It's only 30k entries, right? So I put this together. I know it's inefficient, but I figured I would deal with that later.

    # Same loop to generate the arrays. Still seeing all values in the array before this:
    my @final_output;
    foreach my $line ( @output ) {
        next if ( grep m/^$line$/ @final_output );
        push @final_output, $line;
    }

This time I don't see 0 results, the code just hangs. At first I thought, wow, is it really that inefficient? So I decide I can do better.

From a little reading about uniq I find out it essentially just slams the array values into a hash and then pulls the keys. I don't care about ordering, so why not just roll that myself?

    my %output_hash;
    my $pk = "SomeString";
    foreach my $pool ( @pools ) {
        my @machines = qx/Some command that gets me a list of machines/;
        #Some code here to check for cmd errors, etc. Convert output to an array of hashes (decode_json is involved);
        my $machine_aref = decode_json $machines[0];
        foreach my $row ( @{$machines_aref} ) {
            $output_hash{$row->{$pk}} = 1;
        }
    }
    my @output_array = keys %output_hash;
    return \@output_array;
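The hash-based deduplication described above can be sketched in a few lines. This is only an illustration, not the code from the question: the helper name `uniq_defined` and the sample host names are made up, and skipping undefined values is an added assumption so that a bad row cannot end up as a hash key.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the distinct defined values of a list,
# keeping the order in which each value was first seen.
sub uniq_defined {
    my %seen;
    return grep { defined($_) && !$seen{$_}++ } @_;
}

# Made-up input with one duplicate and one undef entry.
my @input  = ('a.example.com', 'b.example.com', undef, 'a.example.com');
my @unique = uniq_defined(@input);

print join(', ', @unique), "\n";
```

Each value costs one hash lookup, so the work grows linearly with the list instead of rescanning the growing result for every element the way the grep loop does.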

Guess what? It fails. "Use of uninitialized value in concatenation (.) or string at...<The line that assigns the $output_hash{$row->{$pk}} = 1;>"

Ok, great, there is an undefined value in a row from $machine_aref, right? Yes, but no, but yes?

A little more debug code later and I have this...

    my %output_hash;
    my $pk = "SomeString";
    foreach my $pool ( @pools ) {
        my @machines = qx/Some command that gets me a list of machines/;
        #Some code here to check for cmd errors, etc. Convert output to an array of hashes (decode_json is involved);
        my $machine_aref = decode_json $machines[0];
        print "$#{machine_aref} lines returned\n";
        foreach my $row ( @{$machines_aref} ) {
            unless ( defined $row->{$pk} ) {
                print "'$pk'\n";
                print Dumper $row;
                print "$pk, $row\n";
                print "Bad return: Some command from above\n";
                print "HERE: " . $row->{"$pk"} . "\n";
                die;
            }
            $output_hash{$row->{$pk}} = 1;
        }
    }
    my @output_array = keys %output_hash;
    return \@output_array;

Surely we're going to hit the unless block and everything will be undef, right? Yes and no and yes again...

    <PoolName>: 4737 lines returned.
    'SomeString'
    $VAR1 = {
              'SomeString' => '<machinename>.<location>.<something>.com',
              'Count' => 6
            };
    SomeString, HASH(0x106d978)
    Bad command return: <Some command used to get this data>
    Use of uninitialized value in concatenation (.) or string at <script_path> line <Line with print "HERE: " .$row->{"$pk"} . "\n";>.

So I see that $row->{$pk} is undefined for this row, but I can dump the row and see that the key matches $pk exactly. Remember, this works for every other entry in the array. Then it fails again when I try to print the value. WTF?

After this, I decided it was worth seeing which value(s) caused the problem. Turns out it was the first one, and the second, and wait, I guess all of them... So I switched the output of the command I was using to one that would just give me comma-delimited values and now everything works, including uniq. I still wish I understood what went wrong. If anyone is seeing a similar issue, I'm sorry this is a dead end. If someone with more expertise than me understands what is going on, I'd love your feedback. As always, thanks!
Some final details: Perl 5.30.2 built using perlbrew on SLES11 - Linux 3.0.101-108.87-default

Replies are listed 'Best First'.
Re: Dangers of decode_json? or a hash entry is both defined and undefined. WTF?
by Haarg (Priest) on Jun 02, 2020 at 08:10 UTC
    I'd recommend setting $Data::Dumper::Useqq = 1, which will ensure any control characters are visible in the Dumper output.
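A minimal sketch of why that setting can matter. The row below is invented for illustration: its key ends in a carriage return, standing in for any invisible character. Nothing in the thread confirms that this is what happened in the original data.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;

# Hypothetical row: the key carries a trailing carriage return,
# standing in for any invisible character that might sneak into the data.
my $pk  = "SomeString";
my $row = { "SomeString\r" => 'host.example.com', Count => 6 };

# The lookup misses because "SomeString" and "SomeString\r" are different keys.
my $found = defined $row->{$pk} ? 1 : 0;
print "defined? $found\n";

# Default output prints the control character raw, so a terminal may hide it.
{
    local $Data::Dumper::Useqq = 0;
    print Dumper($row);
}

# With Useqq the same key is printed with a visible escape sequence.
my $visible = do {
    local $Data::Dumper::Useqq = 1;
    Dumper($row);
};
print $visible;
```

In the second dump the key appears as "SomeString\r", which makes the mismatch with $pk obvious.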
Re: Dangers of decode_json? or a hash entry is both defined and undefined. WTF?
by perlfan (Vicar) on Jun 02, 2020 at 04:46 UTC
    I can't help but notice in your code examples the lack of:
    use strict; use warnings;
    I'd like to assume you have them, but I've wasted hours (and been to WTF!! and back) before, only to realize I hadn't added those 2 lines.
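As a small illustration of what those two lines catch, the sketch below compiles a snippet that declares `$machine_aref` but then reads `$machines_aref`, the same pair of spellings that appears in the posted code. The data is made up, and the string eval is used only so the compile error can be captured and printed instead of aborting the script.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the decoded JSON: an array of hash refs.
my $machine_aref = [ { SomeString => 'a.example.com' } ];

# Under strict, a misspelled variable is a compile-time error. The string
# eval compiles the snippet separately so the failure can be captured.
my $ok = eval q{
    use strict;
    my @names = map { $_->{SomeString} } @{$machines_aref};   # note the extra "s"
    1;
};
my $error = $ok ? '' : $@;

print "strict says: $error";
```

Without strict, Perl would treat the misspelled name as a separate, empty global variable instead of reporting it.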
Re: Dangers of decode_json? or a hash entry is both defined and undefined. WTF?
by haukex (Archbishop) on Jun 02, 2020 at 21:49 UTC

    Unfortunately, I can't really help here because I don't think what you showed here is very representative, and there are a couple of inconsistencies. For example, you say "my input array of 30k elements became an output of one undef entry" but later "At this point @output is empty ... This time I don't see 0 results". Also, your last piece of example code prints strings that don't appear in exactly the same way in the example output, which means this isn't exactly the code you ran. Please see Short, Self-Contained, Correct Example - something that we can run ourselves will be much more helpful to us and therefore to you. Finally, you never show us what $machine_aref actually looks like (Data::Dump or Data::Dumper with $Data::Dumper::Useqq=1;), but that data structure is really central to your question.
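A self-contained example along those lines might look like the sketch below. Several things here are assumptions rather than details from the question: the JSON text is invented, `decode_json` comes from the core JSON::PP module because the post does not say which JSON module was used, and `uniq` is taken from the core List::Util (version 1.45 or later) rather than List::MoreUtils.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);
use List::Util qw(uniq);
use Data::Dumper;

$Data::Dumper::Useqq    = 1;   # make control characters visible
$Data::Dumper::Sortkeys = 1;   # stable key order in the dump

# Made-up JSON standing in for the real command output.
my $json = '[{"SomeString":"a.example.com","Count":6},'
         . '{"SomeString":"b.example.com","Count":2},'
         . '{"SomeString":"a.example.com","Count":1}]';

my $pk           = "SomeString";
my $machine_aref = decode_json($json);

# Show exactly what was decoded before doing anything else with it.
print Dumper($machine_aref);

# Pull out the key of interest and drop duplicates.
my @output = uniq map { $_->{$pk} } @{$machine_aref};
print scalar(@output), " unique: @output\n";
```

Replacing the inline JSON with a trimmed sample of the real command output would show whether the problem lies in the data or in the surrounding code.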
