
Re: Access Hashes of Hashes etc.

by muba (Priest)
on Sep 14, 2011 at 13:30 UTC


in reply to Access Hashes of Hashes etc.

Usually, when working with such data structures, you're only interested in one node at a time. When that's the case, I find it pretty useful to keep that node in its own variable, so you don't have to bother with the whole structure all the time.

For example, suppose I keep track of the way visitors use my web app from one day to another. To do this, I keep user statistics in an array, where each element represents a day. The last element in the list is yesterday, element -2 is the day before yesterday, et cetera.

Each element is itself an array of hashrefs, one per user, each with at least the key 'username' given. Another key is 'pages', whose value is an AoH with information about the pages this particular user visited, such as the page title and an array of the other users who also visited that page on that day.
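
To make that shape concrete, here is a minimal sketch of one day's worth of data. It assumes each day holds an array of per-user hashrefs and uses the key names from the code in this post (pages, title, others, He_or_She); the users and page titles are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: an array of days, each day an array of
# per-user hashrefs. Only one day (yesterday) is filled in here.
my $webapp_usage_statistics = [
    [
        {
            username  => 'John Doe',
            He_or_She => 'He',
            pages     => [
                {
                    title  => 'Front page',
                    others => [ { username => 'Jane Roe' } ],
                },
            ],
        },
        { username => 'Jane Roe', He_or_She => 'She', pages => [] },
    ],
];

# Element -1 is yesterday; element 0 of that day is the first user.
my $first_user = $webapp_usage_statistics->[-1][0];
print "$first_user->{username} visited ",
    scalar @{ $first_user->{pages} }, " page(s) yesterday\n";
```

With this sample data the last statement prints "John Doe visited 1 page(s) yesterday".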

Suppose I'm tasked to see what pages a certain user, named John Doe, had visited a week ago, and who else has visited the same pages on that day.

    my $webapp_usage_statistics = pull_this_data_from_somewhere();
    my ($johndoe) = grep { $_->{username} eq "John Doe" } @{ $webapp_usage_statistics->[-7] };  # list context: first match, not a count

    # Walk the pages John visited that day, one page hashref at a time.
    for my $page (@{ $johndoe->{pages} }) {
        print "$johndoe->{He_or_She} visited $page->{title}\n";
        my $others = join(", ", map { $_->{username} } @{ $page->{others} });
        print "  as did $others.\n\n";
    }

You see what I did there. Already in line 2 I extract some data from $webapp_usage_statistics and then never bother with $webapp_usage_statistics again, because I know I'm only interested in that one node of seven days ago where John Doe is the user name.

And I go on like that: the moment I begin looping over John's visited pages of a week ago, I've made a variable that keeps a reference to one page at a time, so I can pull the information from it that I want, without constantly saying $johndoe->{pages}->[$idx]->{title}. Nope, just $page->{title}. And when it comes to showing the other users that visited that page that day, I do the same trick. I make map loop over the dereferenced array of users so that I can just say $_->{username} from within the BLOCK. That really beats $webapp_usage_statistics->[-7]->[$some_index]->{pages}->[$some_other_index]->{others}->[$yet_another_index]->{username}, which is really what it boils down to.
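
A small side-by-side sketch of the two styles, assuming the same layout as above. The sample data and the variable names ($stats, $day, $user, $other) are hypothetical and only there so the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical one-day data set, just large enough to walk.
my $stats = [
    [
        {
            username => 'John Doe',
            pages    => [
                { title => 'Home', others => [ { username => 'Jane Roe' } ] },
                {
                    title  => 'About',
                    others => [ { username => 'Jane Roe' }, { username => 'Sam Poe' } ],
                },
            ],
        },
    ],
];

# Spelled out in full: every access restates the whole path.
my $long = $stats->[-1]->[0]->{pages}->[1]->{others}->[1]->{username};

# Stepwise: one variable per level, each named after what it holds.
my $day   = $stats->[-1];
my $user  = $day->[0];
my $page  = $user->{pages}->[1];
my $other = $page->{others}->[1];
my $short = $other->{username};

print "$long\n";
print "both styles reach the same value\n" if $long eq $short;
```

Both routes end at the same hashref; the stepwise version just gives every intermediate node a name you can reuse.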

That code's just there to illustrate a point. It might contain typos, bugs, or oddities. Also, I don't think that writing a webapp usage tracker in this fashion is really a good idea because the data structure is... awkward at best.

Replies are listed 'Best First'.
Re^2: Access Hashes of Hashes etc.
by TomDLux (Vicar) on Sep 14, 2011 at 16:19 UTC

    I often find it convenient to have a routine which claims to process all the records, but which actually goes through the top-level keys and invokes another routine to process the corresponding sub-nodes.

    I prefer to have short subroutines in any case, under 20 lines if at all possible, so this makes it easier to understand what each routine does. It's not like the 80s, when invoking a subroutine was drastically slower than doing things inline.

    It does generate a problem of coming up with names for several levels of subroutine: process_all_names(), process_one_name(), process_first_name(), process_last_name(), ...
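
    A minimal sketch of that two-level split, reusing the names process_all_names() and process_one_name() from above. The record layout (visits, pages) is an assumption made up for the example.

```perl
use strict;
use warnings;

# Hypothetical records keyed by name; the fields are invented for illustration.
my %records = (
    alice => { visits => 3, pages => [ 'Home', 'About' ] },
    bob   => { visits => 1, pages => ['Home'] },
);

# Top level: only knows how to iterate over the keys.
sub process_all_names {
    my ($records) = @_;
    my @lines;
    for my $name ( sort keys %$records ) {
        push @lines, process_one_name( $name, $records->{$name} );
    }
    return @lines;
}

# One level down: only ever sees a single sub-node.
sub process_one_name {
    my ( $name, $record ) = @_;
    my $pages = join ', ', @{ $record->{pages} };
    return "$name: $record->{visits} visit(s) to $pages";
}

my @lines = process_all_names( \%records );
print "$_\n" for @lines;
```

    Each routine stays well under 20 lines, and the inner one never needs to know where its record came from.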

    As Occam said: Entia non sunt multiplicanda praeter necessitatem.

      I know what you mean. I'm currently in the middle of writing a JSON parser/generator, and I have these subroutines with lovely names such as _parser__handle_map, _parser__handle_array, _parser__handle_string, _parser__next_token_type, _parser__next_value, _to_json__format_hash, _to_json__format_array, and _to_json__format_string.

      Pretty straightforward names, and it positively beats those unruly if/elsif/.../else constructions.

      Basically a strategy like that does the same thing as I suggested but even takes it a step further. Instead of pulling data from the greater structure into a variable until you're at the deepest level you care about, you pull the data from the structure into a subroutine until you've got what you really wanted.
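
      A rough sketch of what such a family of small routines might look like for the generating side. The names format_value, format_hash, format_array and format_string are hypothetical and this is not the actual module: it handles only hashes, arrays and plain strings, and escapes only quotes and backslashes.

```perl
use strict;
use warnings;

# Decide which routine handles this node, then hand the node over.
sub format_value {
    my ($value) = @_;
    return format_hash($value)  if ref($value) eq 'HASH';
    return format_array($value) if ref($value) eq 'ARRAY';
    return format_string($value);
}

# Hash node: format each key and recurse into each value.
sub format_hash {
    my ($hash) = @_;
    my @pairs;
    for my $key ( sort keys %$hash ) {
        push @pairs, format_string($key) . ':' . format_value( $hash->{$key} );
    }
    return '{' . join( ',', @pairs ) . '}';
}

# Array node: recurse into each element.
sub format_array {
    my ($array) = @_;
    my @items = map { format_value($_) } @$array;
    return '[' . join( ',', @items ) . ']';
}

# Leaf node: quote the string, escaping quotes and backslashes only.
sub format_string {
    my ($string) = @_;
    $string =~ s/(["\\])/\\$1/g;
    return qq("$string");
}

my $page = { title => 'Home', others => [ 'Jane', 'Sam' ] };
print format_value($page), "\n";
```

      Every routine receives exactly the node it cares about, so none of them has to spell out a path into the larger structure.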
