Help me understand hashes

hallikpapa has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks. Back with a question again. I have loaded a bunch of data into a hash like so in a loop:

$hash{$station}{$row}{$id} = $draw->{Value};

So all data from a file is in that multidimensional hash. What I want to do is compare data across different $station(s). STATION3 has a many-to-one relationship with STATION2, STATION4 has a many-to-one relationship with STATION3, etc. For instance, I drew this on my scratch pad, which doesn't work right, but might help explain what I am looking for:

while ($hash{"STATION2"}{$row}{"ID1"} eq $hash{"STATION3"}{$row}{"ID1"}) {
     ........
}

Example: I want to compare every ID value at STATION3 to see if it matches the ID value at STATION2, if it matches, do something... So if I have all data from a huge file organized nicely in this hash, how would I jump through this hash efficiently? Not asking for a handout here, just always get my butt kicked trying to look at hashes and once in a while regex(s), but the ilovejackdaniels cheat sheet helps with that one. :)

Re: Help me understand hashes by graff (Chancellor) on Nov 16, 2007 at 01:41 UTC
What is the importance of $row in the hash structure? If $id is more important than $row, then maybe your structure should be "$hash{$station}{$id}{$row}"...

Anyway, as it is, you would want to get the ids from all rows of "STATION2", and the ids from all rows of STATION3, and do something for the ones that match. Does it matter whether or not the $row values match too? If not, then you will probably be better off re-ordering the hash layers as suggested in the previous paragraph.

And what sort of "efficiency" are you looking for: run-time speed, or memory footprint? If the structure you showed is essential for some other aspect of your program, but you also want to do this thing with matching ID's, you can either stick with just the one copy of the data in memory and do a lot of extra processing (to navigate through all the rows), or make additional copies of the data (making it easier to work with all the ID's) and do less processing.

Let's suppose there's a good reason to keep the existing structure as-is, and that it's okay to just make a copy of the data so that you can work with the ids. Here's a way to "transpose" the hash layers:

my %rehash;
for my $station ( keys %hash ) {
    for my $row ( keys %{$hash{$station}} ) {
        for my $id ( keys %{$hash{$station}{$row}} ) {
            $rehash{$station}{$id}{$row} = $hash{$station}{$row}{$id};
        }
    }
}

# now check for matching ids among station2 and station3
for my $s2_key ( keys %{$rehash{"STATION2"}} ) {
    if ( exists( $rehash{"STATION3"}{$s2_key} )) {
        # do something...
    }
}
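
For comparison, the other option mentioned above (keep the single copy and pay for extra processing) could look roughly like the sketch below. The sample data and the helper name `station_has_id` are made up for illustration, and, like the transposed version, it only checks that the same id key occurs at both stations, not that the stored values are equal.

```perl
use strict;
use warnings;

# Hypothetical sample data in the original $hash{$station}{$row}{$id} layout.
my %hash = (
    STATION2 => { 1 => { ID1 => 'a' }, 2 => { ID2 => 'b' } },
    STATION3 => { 1 => { ID2 => 'c' }, 2 => { ID3 => 'd' } },
);

# Hypothetical helper: true if any row of $station contains the key $id.
# It scans every row on each call, which is the "extra processing" trade-off.
sub station_has_id {
    my ( $data, $station, $id ) = @_;
    for my $row ( keys %{ $data->{$station} || {} } ) {
        return 1 if exists $data->{$station}{$row}{$id};
    }
    return 0;
}

# Collect each STATION2 id that also occurs somewhere in STATION3.
my %matched;
for my $row ( keys %{ $hash{STATION2} } ) {
    for my $id ( keys %{ $hash{STATION2}{$row} } ) {
        $matched{$id} = 1 if station_has_id( \%hash, 'STATION3', $id );
    }
}

my @common = sort keys %matched;
print "common ids: @common\n";
```

With many rows per station the repeated scans add up, which is the case where the transposed copy is likely to be the better choice.
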
Re: Help me understand hashes by GrandFather (Saint) on Nov 16, 2007 at 02:24 UTC
Often if you can't figure out how to drive a data structure in a clean way to achieve some end, it is because the data structure is not appropriate. `$row` implies a contiguously numbered sequence starting at 0 or 1 to me. If that is the case then an array is much more appropriate than a hash.

In any case a code fragment with no data nor any explanation of what you are actually trying to achieve won't get you very good answers. Try presenting a small sample script with some sample data and telling us about the larger problem you are trying to solve.

Perl is environmentally friendly - it saves trees
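
As a rough illustration of the array suggestion, here is a sketch that assumes rows are numbered contiguously from 1 and stores them as `$data{$station}[$row]{$id}`. The variable names and sample values are invented, not taken from the original program.

```perl
use strict;
use warnings;

# Hypothetical layout: $data{$station}[$row]{$id}, with rows numbered from 1,
# so index 0 is simply left undefined.
my %data;
$data{STATION2}[1]{ID1} = 'x';
$data{STATION2}[2]{ID1} = 'y';
$data{STATION3}[1]{ID1} = 'x';
$data{STATION3}[2]{ID1} = 'z';

# Walk the rows in order and compare the same id at the same row number.
my @matching_rows;
my $last = $#{ $data{STATION2} };
for my $row ( 1 .. $last ) {
    my $left  = $data{STATION2}[$row];
    my $right = $data{STATION3}[$row];
    next unless $left && $right;
    next unless defined $left->{ID1} && defined $right->{ID1};
    push @matching_rows, $row if $left->{ID1} eq $right->{ID1};
}

print "rows where ID1 matches: @matching_rows\n";
```

Because the rows live in an array, they come back in order without sorting keys, and the highest row number is available directly from `$#{ ... }`.
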
Re^2: Help me understand hashes by hallikpapa (Scribe) on Nov 16, 2007 at 17:54 UTC
Yeah, I suppose it's easier when there is a visual. So I have a spreadsheet that has 4 tabs, STATION1 thru STATION4. The first column in each tab is the rowId, just an incremental row counter. Data on STATION1 would look like this:

1,,business,first,last,email,,,,,

STATION2 (Column2 is a reference to Column1 in STATION1):

1,1,,,123,Main,St,City,State,Zip

STATION3 (Column2 references Column1 in STATION2):

1,1,1,P-550,11223344
2,1,2,P-330,22334455

I am inserting data into two hashes like so:

$excel{$tab}{$row}{$col} = $cell->{Val};
$rehash{$tab}{$id}{$row} = $excel{$tab}{$row}{$col};

This attempt is finding no results:

for my $s2_key (keys %{$rehash{"1"}}) {
    if ( exists( $rehash{"1"}{$s2_key} )) {
        print SOMETHING;
    }
}

Update: Durrr! That's what I get for operating on no sleep. Put "1" instead of "STATION2". I will continue to fiddle.
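
Given the sample rows above, one way to follow the "Column2 references Column1" links is to group each STATION3 row under the STATION2 row it points at. The sketch below is only an illustration: it hard-codes the sample lines instead of reading the spreadsheet, and the names `%lines`, `%rows` and `%children_of` are made up.

```perl
use strict;
use warnings;

# Sample rows from the post: column 1 is the row id, column 2 is the id of
# the parent row on the previous tab.
my %lines = (
    STATION2 => ['1,1,,,123,Main,St,City,State,Zip'],
    STATION3 => [ '1,1,1,P-550,11223344', '2,1,2,P-330,22334455' ],
);

# Parse into $rows{$tab}{$row_id} = [ fields ]. A plain split is enough here
# because the sample has no quoted fields; the -1 keeps trailing empty fields.
my %rows;
for my $tab ( keys %lines ) {
    for my $line ( @{ $lines{$tab} } ) {
        my @fields = split /,/, $line, -1;
        $rows{$tab}{ $fields[0] } = \@fields;
    }
}

# Group STATION3 row ids by the STATION2 row they reference (column 2).
my %children_of;
for my $row_id ( sort { $a <=> $b } keys %{ $rows{STATION3} } ) {
    my $parent = $rows{STATION3}{$row_id}[1];
    push @{ $children_of{$parent} }, $row_id;
}

# Report the STATION3 rows that hang off each STATION2 row.
for my $parent ( sort { $a <=> $b } keys %{ $rows{STATION2} } ) {
    my @kids = @{ $children_of{$parent} || [] };
    print "STATION2 row $parent has STATION3 rows: @kids\n";
}
```

Indexing the "many" side by its parent key means each parent lookup is a single hash access, which is the same idea as the transposed `%rehash` but keyed on the reference column.
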
Re: Help me understand hashes by ysth (Canon) on Nov 16, 2007 at 01:53 UTC
"I want to compare every ID value at STATION3 to see if it matches the ID value at STATION2, if it matches, do something"

That question would make sense if there were no {$row} in the middle; as is, it is very unclear what you mean. How about some sample data, showing what is and isn't a match?
Re: Help me understand hashes by planetscape (Chancellor) on Nov 17, 2007 at 10:53 UTC
Some other alternatives for visualizing complex data structures are described here: Re: How can I visualize my complex data structure?, including some graphical options for the more visually inclined. :-)

HTH,
planetscape
Re: Help me understand hashes by injunjoel (Priest) on Nov 17, 2007 at 01:18 UTC
Just a quick note since you seem more than motivated to learn on your own. Try using either Data::Dumper or Dumpvalue to visualize your data structures. Once you see what is going on under the hood you can better construct ways to access it for your specific needs. Then read perldsc until it makes sense. Both of the above modules are core so there is nothing to download, and they are simple to use.

-InjunJoel

"I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo
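
A minimal sketch of the Data::Dumper suggestion, assuming the `$hash{$station}{$row}{$id}` layout from the question; the sample values are invented.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical data in the $hash{$station}{$row}{$id} layout from the question.
my %hash;
$hash{STATION2}{1}{ID1} = 'abc';
$hash{STATION3}{1}{ID1} = 'abc';
$hash{STATION3}{2}{ID1} = 'xyz';

# Sorted keys and a tighter indent make the dump easier to read and to diff.
local $Data::Dumper::Sortkeys = 1;
local $Data::Dumper::Indent   = 1;

# Pass a reference so the whole structure is dumped as one nested value.
my $dump = Dumper( \%hash );
print $dump;
```

Dumping a reference (`\%hash`) rather than the hash itself keeps the output as a single nested structure instead of a flat list of separate variables.
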

