
OK, to save myself some typing, I'm going to simplify the foods database significantly. Here we go:

my %groups_to_foods = (
  proteins   => { eggs     => 1,
                  beef     => 1,
                  tofu     => 1 },
  carbs      => { bread    => 1,
                  pizza    => 1,
                  twinkies => 1 },
  vegetarian => { tofu     => 1,
                  bread    => 1,
                  twinkies => 1 }
);                                 # it's just an example!

# Is 'tofu' in 'carbs'?  (exists tests for the key itself,
# whatever its value)
my $is_tofu_in_carbs = exists $groups_to_foods{ carbs }{ tofu };

# How many foods in the 'vegetarian' group?
my $n_foods_in_vegetarian =
  scalar keys %{ $groups_to_foods{ vegetarian } }; 

# The "scalar" above is optional; assigning to a scalar
# variable already puts keys in scalar context.

# Next: invert the lookup table

my %foods_to_groups;
for my $group ( keys %groups_to_foods ) {
  my $hash_ref = $groups_to_foods{ $group };
  for my $food ( keys %$hash_ref ) {
    $foods_to_groups{ $food }{ $group } = 1;
  }
}

# How many groups does 'twinkies' belong to?
my $n_groups_for_twinkies =
  keys %{ $foods_to_groups{ twinkies } };

# How to change all the keys in %groups_to_foods to uppercase?
# We assume that all the keys start out in lowercase; a key
# that was already uppercase would be assigned to itself
# and then deleted.

for my $group ( keys %groups_to_foods ) {
  $groups_to_foods{ uc $group } = $groups_to_foods{ $group };
  delete $groups_to_foods{ $group };
}
# Note that the values of the new hash are *identical* to
# the values of the old hash; they refer to the same
# locations in memory.  This is because these hash values
# are hash refs, and we just move them from one place to
# another, just like transferring the title of a home from
# one owner to the next owner leaves the home in place.

# Actually, the last solution can be made a little slicker,
# because the delete function returns the value
# corresponding to the deleted key:
for my $group ( keys %groups_to_foods ) {
  $groups_to_foods{ uc $group } =
    delete $groups_to_foods{ $group };
}

# Write a sub to do the same thing for any hash
# First we ignore the possibility of collisions; we
# assume, for instance, that the keys are guaranteed
# to be all in lowercase.
sub uc_keys {
  my $hash_ref = shift;
  $hash_ref->{ uc $_ } = 
    delete $hash_ref->{ $_ } for keys %$hash_ref;
}
# We could have omitted $_ in the call to uc, but let's
# not.

# With this definition, we can accomplish the uppercasing
# of the groups like this:
uc_keys( \ %groups_to_foods );

# OK, let's deal with the possibility of collisions.
# Here's one simple (perhaps too simple) approach
sub uc_keys {
  my $hash_ref = shift;
  {
    my %seen;
    for my $key ( keys %$hash_ref ) {
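      my $uc_key = uc $key;
      # Post-increment returns 0 (which is defined!) the
      # first time a key is seen, so test for truth here,
      # not definedness.
      die "Key collision! ($uc_key)\n"
        if $seen{ $uc_key }++;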
    }
  }
  # if we made it this far, everything's ok
  # rest of code is exactly as before
  for my $key ( keys %$hash_ref ) {
    $hash_ref->{ uc $key } = delete $hash_ref->{ $key };
  }
}
# One objection to the above solution is that uc is called
# twice for every key in the original hash, which, for
# sufficiently strained code, or sufficiently slow values
# of uc, could be a problem.  An alternative solution that
# avoids this problem is this:

sub uc_keys_2 {
  my $hash_ref = shift;
  my %new_hash;
  for my $key ( keys %$hash_ref ) {
    my $uc_key = uc $key;
    die "Key collision! ($uc_key)\n"
      if exists $new_hash{ $uc_key };
    $new_hash{ $uc_key } = $hash_ref->{ $key };
  }
  return \ %new_hash;
}

# In contrast to the first version, which returned nothing
# useful and did all its changes "in place", this one
# returns a hash ref.  So, whereas uc_keys would be used as
# already shown above, uc_keys_2 would be used like this:
%groups_to_foods = %{ uc_keys_2( \ %groups_to_foods ) };

# CAUTION: if instead of the last line one had used
my %new_g2f =  %{ uc_keys_2( \ %groups_to_foods ) };

# followed by
$new_g2f{ CARBS }{ pasta } = 1;

# this would also change the data in %groups_to_foods,
# so that the expression
# $groups_to_foods{ carbs }{ pasta } now equals 1.
# This is because $new_g2f{ CARBS } and
# $groups_to_foods{ carbs } hold *the same hash ref*:
# uc_keys_2 copies the refs, not the hashes they point
# to.  This leads into the topic of deep copying.  I
# give a link to an article on deep copying below.

# Last one: generalize the last sub so that it takes both
# a sub and a hash ref, and transforms the keys of the
# hash with the input sub.  We'll take the same tack as
# in uc_keys_2 and return a new hash ref.  The code is
# almost identical.
sub modify_keys {
  my $sub = shift;
  my $hash_ref = shift;
  my %new_hash;
  for my $key ( keys %$hash_ref ) {
    my $new_key = $sub->( $key );
    die "Key collision! ($new_key)\n"
      if exists $new_hash{ $new_key };
    $new_hash{ $new_key } = $hash_ref->{ $key };
  }
  return \ %new_hash;
}
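
As a usage sketch (not part of the original exercise), here is how modify_keys might be called. The sub is repeated so the snippet runs on its own; the %prices hash and both callbacks are made up purely for illustration.

```perl
use strict;
use warnings;

# Same definition as above, repeated so this runs standalone.
sub modify_keys {
  my $sub      = shift;
  my $hash_ref = shift;
  my %new_hash;
  for my $key ( keys %$hash_ref ) {
    my $new_key = $sub->( $key );
    die "Key collision! ($new_key)\n"
      if exists $new_hash{ $new_key };
    $new_hash{ $new_key } = $hash_ref->{ $key };
  }
  return \ %new_hash;
}

# Hypothetical data, for illustration only.
my %prices = ( eggs => 2, beef => 9, tofu => 3 );

# uc_keys_2 becomes a special case of modify_keys:
my $upper = modify_keys( sub { uc shift }, \ %prices );
print join( ' ', sort keys %$upper ), "\n";   # BEEF EGGS TOFU

# A callback that sends several keys to the same new key
# trips the collision check.  All three keys have length 4,
# so the second key processed collides with the first.
my $result = eval {
  modify_keys( sub { length shift }, \ %prices );
};
print "Failed: $@" unless defined $result;
# prints "Failed: Key collision! (4)"
```

Wrapping the call in eval is just one way to survive the die; a caller that prefers to let the program stop can call modify_keys directly.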

The promised ref to the article on deep-copying (by merlyn) is here.
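
For a quick feel for the difference before reading that article, here is a minimal sketch using dclone from the core Storable module. This is only one way to deep-copy, and not necessarily the approach the article takes; the trimmed-down %groups_to_foods is just for the demo.

```perl
use strict;
use warnings;
use Storable qw( dclone );

# Trimmed-down table, for illustration only.
my %groups_to_foods = (
  carbs => { bread => 1, pizza => 1 },
);

# Shallow copy: the top-level hash is new, but the inner
# hash ref is shared with the original.
my %shallow = %groups_to_foods;
$shallow{ carbs }{ pasta } = 1;
print "pasta: ",
  ( exists $groups_to_foods{ carbs }{ pasta }
      ? "shared" : "independent" ), "\n";   # pasta: shared

# Deep copy: dclone recursively copies the nested data,
# so changes to the copy leave the original alone.
my %deep = %{ dclone( \ %groups_to_foods ) };
$deep{ carbs }{ rice } = 1;
print "rice: ",
  ( exists $groups_to_foods{ carbs }{ rice }
      ? "shared" : "independent" ), "\n";   # rice: independent
```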

the lowliest monk

In reply to Re: References workout by tlm
in thread Continuing from "Turning foreach into map?" - perlreftut and References by ghenry
