PerlMonks  

hash key regular expression pattern

by wsee (Initiate)
on Nov 07, 2002 at 09:26 UTC ( #211023=perlquestion )

wsee has asked for the wisdom of the Perl Monks concerning the following question:

I have a big hash, and some of its keys are regular expression patterns. For example:

    %cat = (
        'C[AEIOU]NWAY'              => 'AAA',
        '(BOSTON|CHICAGO)'          => 'BBB',
        '(LAS VEGAS|NEVADA)'        => 'CCC',
        '(DALLAS|AUSTIN)'           => 'DDD',
        '(NASHVILLE|M[AEIOU]MPHIS)' => 'EEE',
    );

I would like to match a $variable against the keys and print the value stored under the key that matches.

For example, if the $variable = 'CHICAGO', then I would like to pull the corresponding value which is 'BBB'.

Could anyone give me some idea or suggestion?

Thank you...

William

Replies are listed 'Best First'.
Re: hash key regular expression pattern
by dada (Chaplain) on Nov 07, 2002 at 09:53 UTC
    use Tie::RegexpHash from CPAN!

    there's also Tie::Hash::Regex by our fellow davorg, but I can't see the difference between the two :-)

    cheers,
    Aldo

    King of Laziness, Wizard of Impatience, Lord of Hubris

      There is a difference, but it's a bit subtle.

      With Tie::RegexpHash you set your hash keys to regular expressions and then look them up using fixed strings.

      With Tie::Hash::Regex you set the keys to fixed strings and then look them up using regular expressions.

      In this case, it looks like Tie::RegexpHash is the way to go.

      --
      <http://www.dave.org.uk>

      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg
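
To make the distinction concrete, here is a minimal sketch of the two lookup directions written as plain loops, without either module. The sub names `fetch_by_string` and `fetch_by_regex` and the sample data are made up for illustration; this only mimics the behaviour described above and is not how either module is implemented.

```perl
use strict;
use warnings;

# Direction 1 (the Tie::RegexpHash idea): keys are patterns,
# lookups use a fixed string.
my %by_pattern = (
    '(BOSTON|CHICAGO)' => 'BBB',
    '(DALLAS|AUSTIN)'  => 'DDD',
);

sub fetch_by_string {
    my ($string) = @_;
    for my $pattern (sort keys %by_pattern) {
        return $by_pattern{$pattern} if $string =~ /$pattern/;
    }
    return;    # no pattern matched
}

# Direction 2 (the Tie::Hash::Regex idea): keys are fixed strings,
# lookups use a pattern.
my %by_string = (
    BOSTON => 'BBB',
    DALLAS => 'DDD',
);

sub fetch_by_regex {
    my ($regex) = @_;
    for my $key (sort keys %by_string) {
        return $by_string{$key} if $key =~ $regex;
    }
    return;    # no key matched
}

print fetch_by_string('CHICAGO'), "\n";    # prints BBB
print fetch_by_regex(qr/^DAL/), "\n";      # prints DDD
```

The original question stores patterns and looks up with a fixed string, which is direction 1.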

Re: hash key regular expression pattern
by BrowserUk (Patriarch) on Nov 07, 2002 at 09:52 UTC

    Try this

        #! perl -sw
        use strict;

        my %cat = (
            'C[AEIOU]NWAY'              => 'AAA',
            '(BOSTON|CHICAGO)'          => 'BBB',
            '(LAS VEGAS|NEVADA)'        => 'CCC',
            '(DALLAS|AUSTIN)'           => 'DDD',
            '(NASHVILLE|M[AEIOU]MPHIS)' => 'EEE',
        );

        my $variable = 'CHICAGO';

        for (keys %cat) {
            print $cat{$_}, $/ if $variable =~ /$_/;
        }

        __END__
        #Output
        c:\test>211023
        BBB

        c:\test>

    Nah! You're thinking of Simon Templar, originally played (on UKTV) by Roger Moore and later by Ian Ogilvy
      Two suggestions:

      1. Precompile the regexes:
        qr/C[AEIOU]NWAY/ => 'AAA' ...
      2. Consider storing the regexes in an array and keeping a separate lookup hash for the values (or using a tied hash whose key order you can control). This lets you put the most common cases first, assuming you exit the loop on the first match and are optimizing for speed.
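
A rough sketch of both suggestions combined, as a variation that keeps each compiled pattern next to its value in one ordered array instead of an array plus a separate hash. The name `@rules`, the sub `lookup`, and the ordering shown are assumptions for illustration; put whichever cases are most common for your data first.

```perl
use strict;
use warnings;

# Ordered list of [compiled pattern, value] pairs. The patterns are
# compiled once by qr// and are tried in exactly this order.
my @rules = (
    [ qr/(BOSTON|CHICAGO)/          => 'BBB' ],
    [ qr/(LAS VEGAS|NEVADA)/        => 'CCC' ],
    [ qr/(DALLAS|AUSTIN)/           => 'DDD' ],
    [ qr/(NASHVILLE|M[AEIOU]MPHIS)/ => 'EEE' ],
    [ qr/C[AEIOU]NWAY/              => 'AAA' ],
);

sub lookup {
    my ($city) = @_;
    for my $rule (@rules) {
        my ($pattern, $value) = @$rule;
        return $value if $city =~ $pattern;    # stop at the first match
    }
    return;    # nothing matched
}

print lookup('CHICAGO'), "\n";    # prints BBB
```

Because the patterns live in an array rather than as hash keys, they stay compiled `Regexp` objects and the search order is predictable.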
Re: hash key regular expression pattern
by snowcrash (Friar) on Nov 07, 2002 at 09:57 UTC
    Hi! A quick search on the CPAN and I found Tie::RegexpHash, which seems to suit your needs. I've never used it, but here's an example from the man page:
        use Tie::RegexpHash;

        my %hash;
        tie %hash, 'Tie::RegexpHash';

        $hash{ qr/^5(\s+|-)?gal(\.|lons?)?/i } = '5-GAL';

        $hash{'5 gal'};     # returns "5-GAL"
        $hash{'5GAL'};      # returns "5-GAL"
        $hash{'5 gallon'};  # also returns "5-GAL"

    snowcrash
Re: hash key regular expression pattern
by djantzen (Priest) on Nov 07, 2002 at 09:54 UTC

    You could run through the keys and do a pattern match on each one, like:

        my %cat = ('something' => 'stuff',);
        my $variable = 'something';
        my $match;

        while (my ($key, $value) = each %cat) {
            if ($variable =~ /$key/) {
                $match = $value and last;
            }
        }

    Unfortunately this really nullifies the primary usefulness of a hash, i.e., efficient and straightforward dictionary lookup. A better solution, if you can change the way you store your data, is to use Tie::Hash::Regex, which lets you use a regular expression to do key lookups. Very spiffy IMO, and written by a couple of local monks too.

    Update: fixed two typos in the code snippet.

      Isn't that just pushing the loop under the covers? Or is there some benefit I am missing here?


      Nah! You're thinking of Simon Templar, originally played (on UKTV) by Roger Moore and later by Ian Ogilvy

        Isn't using the Tie::Hash::Regex module just pushing the loop under the covers? I dunno, I didn't write it, although if it's a pure Perl module then I would suspect that's how it is implemented. Personally I'd like to see that functionality available in perl itself rather than as a tie'd module, though more for speed than for convenience. After all, a standard hash lookup is remarkably faster than a foreach over a hash (or at least it was last time I compared the two methods with a tie'd DBM of about 50,000 entries).

        Update: a brief perusal of the code shows a combination in the FETCH subroutine of for and qr to prevent recompilation of the regex. The real meat of the code is:

            my $key = qr/$key/;
            /$key/ and return $self->{$_} for keys %$self;
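
For readers who want to see where such a loop sits inside a tied hash, here is a minimal sketch built on the core `Tie::StdHash` class. The package name `My::RegexLookup` is hypothetical, and the exact-match shortcut before the loop is an assumption of this sketch; it is not a copy of the Tie::Hash::Regex source.

```perl
use strict;
use warnings;

package My::RegexLookup;    # hypothetical name, for illustration only

require Tie::Hash;
our @ISA = ('Tie::StdHash');    # inherit TIEHASH, STORE, CLEAR, etc.

# Override only FETCH: try an exact key first, then fall back to
# treating the requested key as a pattern and scanning the stored keys.
sub FETCH {
    my ($self, $key) = @_;
    return $self->{$key} if exists $self->{$key};

    my $re = qr/$key/;    # compile once, outside the loop
    for my $stored (sort keys %$self) {
        return $self->{$stored} if $stored =~ $re;
    }
    return;               # no stored key matched
}

package main;

my %city;
tie %city, 'My::RegexLookup';
%city = ( BOSTON => 'BBB', DALLAS => 'DDD' );

print $city{'BOSTON'}, "\n";    # exact key, prints BBB
print $city{'^DAL'}, "\n";      # pattern lookup, prints DDD
```

The loop is still there, so the lookup stays linear in the number of keys; the tie only hides it behind ordinary hash syntax.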

Re: hash key regular expression pattern
by dingus (Friar) on Nov 07, 2002 at 09:53 UTC
        %cat = (
            'C[AEIOU]NWAY'              => 'AAA',
            '(BOSTON|CHICAGO)'          => 'BBB',
            '(LAS VEGAS|NEVADA)'        => 'CCC',
            '(DALLAS|AUSTIN)'           => 'DDD',
            '(NASHVILLE|M[AEIOU]MPHIS)' => 'EEE'
        );
        $variable = 'CHICAGO';
        for (keys %cat) {
            next unless $variable =~ /$_/;
            print "$variable maps to $cat{$_}\n";
            last;
        }

    Dingus


    Enter any 47-digit prime number to continue.
(z) Re: hash key regular expression pattern
by zigdon (Deacon) on Nov 07, 2002 at 14:26 UTC
    What about something like this?:
        %cat = (
            'C[AEIOU]NWAY'                 => 'AAA',
            '(?:BOSTON|CHICAGO)'           => 'BBB',
            '(?:LAS VEGAS|NEVADA)'         => 'CCC',
            '(?:DALLAS|AUSTIN)'            => 'DDD',
            '(?:NASHVILLE|M[AEIOU]MPHIS)'  => 'EEE'
        );
        $var = "CHICAGO";
        @keys = keys %cat;
        $re = "(".join(")|(", @keys).")";
        if ($var =~ /$re/o) {
            for ( 1 .. @keys) {
                print "$_:", $$_, " - ", $keys[$_-1], "\n" if defined $$_
            }
        }

    Problems:

    • It doesn't run under strict refs (it reads $1, $2, ... through $$_).
    • It still loops over the capture groups (though it only does one regexp match).
    • I had to change all the '('s in the data to '(?:'.

    I think the for loop could be replaced with one of the magical $+ or $- vars, but I couldn't find exactly the one I'm looking for - "The var that tells me how many of $1, $2 were defined in the last match".

    -- Dan

      It's $+ (see perlvar).

      Nice idea.
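
A sketch of the single-match idea without the symbolic references, assuming the keys contain no capturing groups of their own (hence the `(?:` change noted above). Per perlvar, `$+` holds the text matched by the highest-numbered group that matched, while `$#-` gives that group's number, which is what is needed to index back into the key list. The sub name `lookup` and the sorted key order are choices made for this example.

```perl
use strict;
use warnings;

my %cat = (
    'C[AEIOU]NWAY'                => 'AAA',
    '(?:BOSTON|CHICAGO)'          => 'BBB',
    '(?:LAS VEGAS|NEVADA)'        => 'CCC',
    '(?:DALLAS|AUSTIN)'           => 'DDD',
    '(?:NASHVILLE|M[AEIOU]MPHIS)' => 'EEE',
);

# Wrap each key in its own capture group and join them into one
# alternation, compiled once. Group N corresponds to $keys[N-1].
my @keys        = sort keys %cat;
my $alternation = join '|', map { "($_)" } @keys;
my $re          = qr/$alternation/;

sub lookup {
    my ($var) = @_;
    return unless $var =~ $re;
    my $group = $#-;    # number of the last capture group that matched
    return $cat{ $keys[ $group - 1 ] };
}

print lookup('CHICAGO'), "\n";    # prints BBB
print lookup('MEMPHIS'), "\n";    # prints EEE
```

Only one alternative can match at a time, so the last matched group is also the only matched group, and no loop over `$1`, `$2`, ... is needed.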

Re: hash key regular expression pattern
by thor (Priest) on Nov 07, 2002 at 14:05 UTC
    In addition to what everyone else has said, thor has to wonder if you can use compiled regexen (obtained using the qr() operator) as the hash keys. This would speed up your search.
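
One caveat worth checking before relying on this: Perl hash keys are always plain strings, so a `qr//` object used as a key is stringified when it is stored. The sketch below shows that, and one possible workaround that keeps the compiled objects as hash values instead; the name `%compiled` is an assumption for illustration.

```perl
use strict;
use warnings;

my $re  = qr/C[AEIOU]NWAY/;
my %cat = ( $re => 'AAA' );

# The key that comes back out is a string, not the Regexp object.
my ($key) = keys %cat;
print ref($re)  ? "original is a compiled Regexp object\n" : "original is a string\n";
print ref($key) ? "key is still an object\n"               : "key is a plain string\n";

# The string still works as a pattern, but it has to be compiled
# again when it is interpolated into a match.
print "CONWAY matches\n" if 'CONWAY' =~ /$key/;

# Possible workaround: map each pattern string to its compiled form
# once, and match against the stored objects afterwards.
my %compiled = map { ( $_ => qr/$_/ ) } keys %cat;
print "CONWAY matches the stored object\n" if 'CONWAY' =~ $compiled{$key};
```

Storing the compiled patterns in an array or as hash values avoids the round trip through a string.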

    thor

Re: hash key regular expression pattern
by wsee (Initiate) on Nov 07, 2002 at 22:37 UTC
    Thank you for all the great suggestions and tips. I really appreciate it.
      use List::Util qw<first>;
      print $cat{ ( first { m/CHICAGO/i } keys %cat ) || '' };
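
Note that the one-liner above tests whether the text of a key contains the word CHICAGO, which happens to work for '(BOSTON|CHICAGO)' but would not find 'MEMPHIS', because that key's text is 'M[AEIOU]MPHIS'. A sketch that matches the variable against each key as a pattern instead, with `lookup` as a made-up sub name and the empty-string fallback kept from the original:

```perl
use strict;
use warnings;
use List::Util qw(first);

my %cat = (
    'C[AEIOU]NWAY'              => 'AAA',
    '(BOSTON|CHICAGO)'          => 'BBB',
    '(LAS VEGAS|NEVADA)'        => 'CCC',
    '(DALLAS|AUSTIN)'           => 'DDD',
    '(NASHVILLE|M[AEIOU]MPHIS)' => 'EEE',
);

sub lookup {
    my ($variable) = @_;
    # first() returns the first key whose pattern matches $variable,
    # or undef when none does.
    my $key = first { $variable =~ /$_/ } sort keys %cat;
    return defined $key ? $cat{$key} : '';
}

print lookup('CHICAGO'), "\n";    # prints BBB
print lookup('MEMPHIS'), "\n";    # prints EEE
```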

Node Type: perlquestion [id://211023]
Approved by djantzen