Optimizing quickly hash tables by checking multiple conditions

juo has asked for the wisdom of the Perl Monks concerning the following question:

Hello,
I have some very large hash tables and I need to pull out all the entries that meet certain conditions. I am looking for an easy and fast way to do this. I know this could easily be done in SQL, but I am not sure whether the same approach applies to a hash table.
Below is a sample. The hash table can be divided into 6 sections (L1-L6). I want to create a new hash table in which sections are merged whenever several conditions are fulfilled. For example:
- The side has to be the same
- The CPN has to be the same
- The number of states and the state numbers have to be the same.
You will see that the new result has only 4 sections. In two cases the conditions were fulfilled and the sections were merged. Does anybody have an idea how to query a big hash table for several conditions in a fast and easy way? I have included a sample piece of the hash table.

Source hash table :

001 => HASH(0x1e19710)
   'CPN' => 'A1'
   'REFDES' => 'L1'
   'SIDE' => 'top'
   'STATES' => HASH(0x1e19740)
      STATE1 => HASH(0x1e19758)
         'STATECPN' => 43124388002
      STATE2 => HASH(0x1e19758)
         'STATECPN' => 43124388003
      STATE3 => HASH(0x1e19758)
         'STATECPN' => 43124388023
002 => HASH(0x1e19710)
   'CPN' => 'A1'
   'REFDES' => 'L2'
   'SIDE' => 'top'
   'STATES' => HASH(0x1e19740)
      STATE1 => HASH(0x1e19758)
         'STATECPN' => 43124388002
      STATE2 => HASH(0x1e19758)
         'STATECPN' => 43124388003
003 => HASH(0x1e19710)
   'CPN' => 'A1'
   'REFDES' => 'L3'
   'SIDE' => 'top'
   'STATES' => HASH(0x1e19740)
      STATE1 => HASH(0x1e19758)
         'STATECPN' => 43124388002
      STATE2 => HASH(0x1e19758)
         'STATECPN' => 43124388003
      STATE3 => HASH(0x1e19758)
         'STATECPN' => 43124388023
004 => HASH(0x1e19710)
   'CPN' => 'A2'
   'REFDES' => 'L4'
   'SIDE' => 'top'
   'STATES' => HASH(0x1e19740)
      STATE1 => HASH(0x1e19758)
         'STATECPN' => 43124388002
      STATE2 => HASH(0x1e19758)
         'STATECPN' => 43124388003
      STATE3 => HASH(0x1e19758)
         'STATECPN' => 43124388023
005 => HASH(0x1e19710)
   'CPN' => 'A1'
   'REFDES' => 'L5'
   'SIDE' => 'bottom'
   'STATES' => HASH(0x1e19740)
      STATE1 => HASH(0x1e19758)
         'STATECPN' => 43124388002
      STATE2 => HASH(0x1e19758)
         'STATECPN' => 43124388003
      STATE3 => HASH(0x1e19758)
         'STATECPN' => 43124388023
006 => HASH(0x1e19710)
   'CPN' => 'A1'
   'REFDES' => 'L6'
   'SIDE' => 'top'
   'STATES' => HASH(0x1e19740)
      STATE1 => HASH(0x1e19758)
         'STATECPN' => 43124388002
      STATE2 => HASH(0x1e19758)
         'STATECPN' => 43124388003


Result should be something like this :


top => HASH(0x1e19710)
   001 => HASH(0x1e19710)
       'CPN' => 'A1'
       'REFDES' => ARRAY(0x1d15ef0)
          0  'L6'
          1  'L2'
       'STATES' => HASH(0x1e19740)
          STATE1 => HASH(0x1e19758)
             'STATECPN' => 43124388002
          STATE2 => HASH(0x1e19758)
             'STATECPN' => 43124388003
   002 => HASH(0x1e19710)
       'CPN' => 'A1'
       'REFDES' => ARRAY(0x1d15ef0)
          0  'L1'
          1  'L3'
       'STATES' => HASH(0x1e19740)
          STATE1 => HASH(0x1e19758)
             'STATECPN' => 43124388002
          STATE2 => HASH(0x1e19758)
             'STATECPN' => 43124388003
          STATE3 => HASH(0x1e19758)
             'STATECPN' => 43124388023
   003 => HASH(0x1e19710)
       'CPN' => 'A2'
       'REFDES' => ARRAY(0x1d15ef0)
          0  'L4'
       'STATES' => HASH(0x1e19740)
          STATE1 => HASH(0x1e19758)
             'STATECPN' => 43124388002
          STATE2 => HASH(0x1e19758)
             'STATECPN' => 43124388003
          STATE3 => HASH(0x1e19758)
             'STATECPN' => 43124388023
bottom => HASH(0x1e19710)
   001 => HASH(0x1e19710)
       'CPN' => 'A1'
       'REFDES' => ARRAY(0x1d15ef0)
          0  'L5'
       'SIDE' => 'bottom'
       'STATES' => HASH(0x1e19740)
          STATE1 => HASH(0x1e19758)
             'STATECPN' => 43124388002
          STATE2 => HASH(0x1e19758)
             'STATECPN' => 43124388003
          STATE3 => HASH(0x1e19758)
             'STATECPN' => 43124388023


Replies are listed 'Best First'.
Re: Optimizing quickly hash tables by checking multiple conditions by revdiablo (Prior) on May 05, 2004 at 06:03 UTC
What have you tried? Did the obvious solution (i.e. `map { $_ => $hash{$_} } grep { foo($hash{$_}) and bar($hash{$_}) } keys %hash`) not work for you? Why not? Do you have any other ideas how this could be accomplished? Did you try them? What about them failed? You should ask yourself these questions before you post, and then answer them in your post. I think you'll get more response this way.
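For readers who want something runnable, here is a minimal sketch of that one-liner. The sample data and the two predicates `is_top` and `is_a1` are hypothetical stand-ins for `foo` and `bar`; they are not from the original post, and the real selection criteria would replace them.

```perl
use strict;
use warnings;

# Hypothetical sample data, shaped like the records in the question.
my %hash = (
    '001' => { CPN => 'A1', REFDES => 'L1', SIDE => 'top' },
    '004' => { CPN => 'A2', REFDES => 'L4', SIDE => 'top' },
    '005' => { CPN => 'A1', REFDES => 'L5', SIDE => 'bottom' },
);

# Hypothetical predicates standing in for foo() and bar().
sub is_top { my ($rec) = @_; return $rec->{SIDE} eq 'top' }
sub is_a1  { my ($rec) = @_; return $rec->{CPN}  eq 'A1'  }

# Keep only the entries whose record passes both tests.
my %filtered =
    map  { $_ => $hash{$_} }
    grep { is_top($hash{$_}) and is_a1($hash{$_}) }
    keys %hash;

print join(',', sort keys %filtered), "\n";
```

Each predicate receives the record (a hash reference), so adding a further condition only means adding another `and` clause inside the `grep` block.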
Re: Re: Optimizing quickly hash tables by checking multiple conditions by juo (Curate) on May 05, 2004 at 06:56 UTC
Hello, I am not that experienced with Perl and I am not a programmer by nature. Although I have been dabbling in Perl for years, I still have no idea how to use map, grep and bar inside hash tables. I can use them on arrays, but that would be about it. I have tried to make it work the easy-to-understand way, by using foreach loops and if statements and then pushing everything into a new hash, but I got totally stuck, as you can imagine. I could not find much documentation on the different ways of handling hash tables, although I don't tend to use anything other than hash tables in my code. If anybody has some good documentation or examples on how to use functions like map and grep with hash tables, that would be great.
Re: Re: Re: Optimizing quickly hash tables by checking multiple conditions by revdiablo (Prior) on May 05, 2004 at 16:37 UTC
"I still have no idea on how to use map,grep ... inside hash tables"

The basic technique in my example is fairly simple. First we filter out the hash keys we want, with grep. Once that list of wanted keys is obtained, we have to turn it back into a hash. This is what the map is for. The `foo` and `bar` were just examples. Those are your selection criteria (i.e. how you decide which keys you want to toss and which you want to keep). Here it is laid out a bit more clearly. Remember, since the pipeline goes from right to left, read the comments from bottom to top:

    my %newhash =                   # keys/values into a hash
        map  { $_ => $hash{$_} }    # a list of keys/values
        grep { foo($hash{$_})       # just the keys we want
               and bar($hash{$_}) }
        keys %hash;                 # a list of keys in %hash

Now, if there are a lot of keys in your hash (as you seem to imply), building all these big lists might be rather slow. I would try it before ruling it out, though. Also note that while we're making a whole new hash, the sub-hashes are not going to all get copied. The references will get copied, but they'll be referring to the same anonymous hashes as before. This shouldn't be a big bottleneck (but, again, without trying it's hard to say for sure).

"I have tried to make it work the easy to understand way by using foreach loops and if statement and then pushing everything into a new hash but got totally stuck"

This might be the way you want to go, if the map and grep solution is still too confusing. Again, if you show us the code you tried, we can try to help figure out why it's not working. You might want to post a new question, then reply here with a link to it. That way you get more people looking at it, but people following the discussion here can still follow the new post.
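As a rough illustration of the foreach route, the sketch below groups records by SIDE and then by a "signature" string built from the CPN and the states, collecting the REFDES values of matching records into an array. It is only one possible approach, not code from the thread: the `signature` helper, the `%grouped` and `%result` names, and the trimmed sample data are all assumptions, and the `|` and `=` separators assume those characters never occur in a CPN or state name.

```perl
use strict;
use warnings;

# Hypothetical sample, shaped like the source hash in the question
# (trimmed to four records to keep the sketch short).
my %hash = (
    '001' => { CPN => 'A1', REFDES => 'L1', SIDE => 'top',
               STATES => { STATE1 => { STATECPN => 43124388002 },
                           STATE2 => { STATECPN => 43124388003 },
                           STATE3 => { STATECPN => 43124388023 } } },
    '002' => { CPN => 'A1', REFDES => 'L2', SIDE => 'top',
               STATES => { STATE1 => { STATECPN => 43124388002 },
                           STATE2 => { STATECPN => 43124388003 } } },
    '003' => { CPN => 'A1', REFDES => 'L3', SIDE => 'top',
               STATES => { STATE1 => { STATECPN => 43124388002 },
                           STATE2 => { STATECPN => 43124388003 },
                           STATE3 => { STATECPN => 43124388023 } } },
    '005' => { CPN => 'A1', REFDES => 'L5', SIDE => 'bottom',
               STATES => { STATE1 => { STATECPN => 43124388002 },
                           STATE2 => { STATECPN => 43124388003 },
                           STATE3 => { STATECPN => 43124388023 } } },
);

# One string per record capturing every condition except SIDE:
# the CPN plus each state name and its STATECPN, in sorted order.
sub signature {
    my ($rec) = @_;
    my $states = $rec->{STATES};
    return join '|', $rec->{CPN},
        map { $_ . '=' . $states->{$_}{STATECPN} } sort keys %$states;
}

# Single pass over the source: $grouped{SIDE}{signature} = merged record.
my %grouped;
for my $id (sort keys %hash) {
    my $rec   = $hash{$id};
    my $group = $grouped{ $rec->{SIDE} }{ signature($rec) } ||= {
        CPN    => $rec->{CPN},
        STATES => $rec->{STATES},    # shared reference, not a deep copy
        REFDES => [],
    };
    push @{ $group->{REFDES} }, $rec->{REFDES};
}

# Optional: renumber the groups 001, 002, ... within each side.
my %result;
for my $side (keys %grouped) {
    my $n = 0;
    $result{$side}{ sprintf '%03d', ++$n } = $grouped{$side}{$_}
        for sort keys %{ $grouped{$side} };
}
```

Because each record is visited once and the group is found by a hash lookup, the work grows roughly linearly with the number of records, rather than requiring every record to be compared against every other one.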
Re: Re: Re: Re: Optimizing quickly hash tables by checking multiple conditions by juo (Curate) on May 06, 2004 at 02:44 UTC
Re: Re: Re: Re: Re: Optimizing quickly hash tables by checking multiple conditions by revdiablo (Prior) on May 06, 2004 at 04:50 UTC
Re: Re: Re: Re: Optimizing quickly hash tables by checking multiple conditions by juo (Curate) on May 06, 2004 at 05:01 UTC

