Hash of Regex

by bichonfrise74 (Vicar)
on Apr 06, 2010 at 21:52 UTC (#833133)

bichonfrise74 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to build a hash of possible substitutes for a given word or regex. So, I came up with this.
#!/usr/bin/perl
use strict;

my $test = 'The brown fox int(10) over float(200) fence.';
my %dict = (
    'brown'      => 'yellow',
    /int(\d+)/   => 'int',
    /float(\d+)/ => 'float',
);

for my $i (keys %dict) {
    $test =~ s/($i)/$dict{$i}/gi;
}
print "$test\n";
This is the output.
The yellow fox float(10) over float(200) fence.
My expected output is something like this.
The yellow fox int over float fence.
So, I'm not sure if I can put regex in the keys of my hash and use that to check for possible substitutes. Also, why did int(10) become float(10) in the output??

Replies are listed 'Best First'.
Re: Hash of Regex
by zwon (Abbot) on Apr 06, 2010 at 21:59 UTC

    Try checking the contents of the hash:

    use strict;

    my $test = 'The brown fox int(10) over float(200) fence.';
    my %dict = (
        'brown'      => 'yellow',
        /int(\d+)/   => 'int',
        /float(\d+)/ => 'float',
    );

    use Data::Dumper;
    warn Dumper \%dict;

    __END__
    $VAR1 = {
        'int' => 'float',
        'brown' => 'yellow'
    };
    And if you'd use warnings...
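A minimal sketch of what enabling warnings would likely reveal, assuming `$_` is undefined as it is at the top of the original script. The `$SIG{__WARN__}` handler and the `@seen` array are illustrative additions that are not part of the original post; they only collect the warnings so they can be inspected.

```perl
use strict;
use warnings;

my @seen;
# Collect warnings in @seen instead of printing them to STDERR.
local $SIG{__WARN__} = sub { push @seen, @_ };

# Assumption: $_ is undefined, as at the start of the original script.
local $_;

my %dict = (
    'brown'      => 'yellow',
    /int(\d+)/   => 'int',     # a match against $_, not a stored pattern
    /float(\d+)/ => 'float',   # likewise
);

# Each bare /.../ is executed against the undefined $_ right here,
# which is what triggers the "uninitialized value" warnings.
print "warning: $_" for @seen;
```

The warnings point at the real problem: the patterns are run while the hash is being built, so they never become keys.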

    Perhaps you also will be interested in:

    my %dict = (
        'brown'        => 'yellow',
        qr/int(\d+)/   => 'int',
        qr/float(\d+)/ => 'float',
    );

    Update: And what you actually want, if I got you right, is:

    use strict;

    my $test = 'The brown fox int(10) over float(200) fence.';
    my %dict = (
        'brown'        => 'yellow',
        'int\(\d+\)'   => 'int',
        'float\(\d+\)' => 'float',
    );

    for my $i ( keys %dict ) {
        $test =~ s/$i/$dict{$i}/gi;
    }
    print "$test\n";
      $VAR1 = { 'int' => 'float', 'brown' => 'yellow' };

      (note to the OP)  In case you wonder why this is:

      my %dict = ( 'brown' => 'yellow', /int(\d+)/ => 'int', /float(\d+)/ => 'float', );

      is the same as

      my %dict = ( 'brown' => 'yellow', $_ =~ /int(\d+)/ => 'int', $_ =~ /float(\d+)/ => 'float', );

      which (in this case) is the same as

      my %dict = ( 'brown' => 'yellow', () => 'int', () => 'float', );

      which is the same as

      my %dict = ( 'brown' => 'yellow', 'int' => 'float', );
        Why / how did this
        my %dict = ( 'brown' => 'yellow', () => 'int', () => 'float', );
        become this??
        my %dict = ( 'brown' => 'yellow', 'int' => 'float', );
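A short sketch of the step being asked about. The `=>` operator is just a comma that quotes a bareword on its left, and an empty list disappears when it is flattened into a larger list, so the remaining four elements pair up differently from how the source code looks. The `@pairs` array is an illustrative name, not something from the thread.

```perl
use strict;
use warnings;

# Empty lists contribute no elements when the list is flattened.
my @pairs = ( 'brown' => 'yellow', () => 'int', () => 'float' );
# @pairs now holds four elements: 'brown', 'yellow', 'int', 'float'

# Assigning a list to a hash pairs consecutive elements as key, value.
my %dict = @pairs;

print scalar(@pairs), " elements\n";
print "$_ => $dict{$_}\n" for sort keys %dict;
```

This is also why int(10) turned into float(10) in the original output: the hash ended up with the key 'int' mapped to the value 'float', and the loop applied that as a substitution.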
      my %dict = ( 'brown' => 'yellow', qr/int(\d+)/ => 'int', qr/float(\d+)/ => 'float', );
      This is the output, but what does it mean and why doesn't it work?
      $VAR1 = { '(?-xism:int(\\d+))' => 'int', 'brown' => 'yellow', '(?-xism:float(\\d+))' => 'float' };
      This looks more elegant than this.
      my %dict = ( 'brown' => 'yellow', 'int\(\d+\)' => 'int', 'float\(\d+\)' => 'float', );

        Only strings can be hash keys. So when %dict is built with regular expressions (the qr// constructs), they are converted to strings to become keys in the hash [1]. For regular expressions, incidentally, these are strings that can be compiled back into the same regular expression they stemmed from, which is quite convenient.

        So why did this not do anything to bichonfrise74's input string?

        my %dict = ( 'brown' => 'yellow', qr/int(\d+)/ => 'int', qr/float(\d+)/ => 'float', );
        I was scratching my head myself until I use'd re 'debug'. zwon happened to forget to escape the parentheses in his (intermediate) example, so the parentheses act as a capture group and the pattern looks for "int" followed directly by digits. If you use the (correctly written) qr// constructs below, things "work" as expected:
        my %dict = ( 'brown' => 'yellow', qr/int\(\d+\)/ => 'int', qr/float\(\d+\)/ => 'float', );

        [1] The ?-xism used to confuse me until I realized this just explicitly states: turn off the x, i, s and m regex flags.
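A minimal sketch of the corrected qr// version running end to end. It assumes the same input string as the original question. The exact stringified form of a qr// object depends on the perl version (recent perls print something like `(?^:...)` rather than `(?-xism:...)`), so the code does not rely on it.

```perl
use strict;
use warnings;

my $test = 'The brown fox int(10) over float(200) fence.';
my %dict = (
    'brown'          => 'yellow',
    qr/int\(\d+\)/   => 'int',
    qr/float\(\d+\)/ => 'float',
);

for my $key (keys %dict) {
    # $key is a plain string by now (ref($key) is empty); interpolating it
    # into s/// compiles it back into an equivalent pattern.
    $test =~ s/$key/$dict{$key}/gi;
}
print "$test\n";   # The yellow fox int over float fence.
```

The three patterns do not overlap, so the unspecified order of `keys %dict` does not change the result here.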

      Perhaps you also will be interested in:

      my %dict = ( 'brown' => 'yellow', qr/int(\d+)/ => 'int', qr/float(\d+)/ => 'float', );
      That doesn't do anything for me.

      Edit: Woohoo. More negative rep. Keep it coming. I'm shooting for -100,000 by next week.
        That doesn't do anything for me.
        It shouldn't. The OP asked whether he can use a regexp as a hash key, so I think he may be interested in inspecting the contents of that hash with the help of Data::Dumper.
Re: Hash of Regex
by nvivek (Vicar) on Apr 07, 2010 at 05:13 UTC
    When you want to use those as regular expression patterns in a substitution, there is no need to wrap them in the match operator (//). You only want to take the pattern from the hash key and put it into the substitution. For this, you can try the following script.

    my $test = 'The brown fox int(10) over float(200) fence.';
    my %dict = (
        'brown'        => 'yellow',
        'int\(\d+\)'   => 'int',
        'float\(\d+\)' => 'float',
    );

    for my $i (keys %dict) {
        $test =~ s/($i)/$dict{$i}/gi;
    }
    print $test;

    For a better understanding, you can print the internal structure of the hash using the Dumper function. You need to escape the parentheses because they are otherwise used for grouping in the pattern.
      Or to state nvivek's point another way:

      /int\(\d+\)/ is not a regular expression

      It is the match operator //

      with a regular expression int\(\d+\)

      After all, you wouldn't write

      $test =~ s//int\(\d+\)//int/gi;
      would you?
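As a possible alternative to the hash-based approaches above, and not something proposed in the thread, the rules could be kept as an ordered list of pattern/replacement pairs. This is only a sketch under that assumption: the `@rules` array and its layout are hypothetical. It keeps the compiled qr// objects intact and makes the order of substitution explicit, which a hash does not guarantee.

```perl
use strict;
use warnings;

my $test = 'The brown fox int(10) over float(200) fence.';

# Hypothetical layout: each rule is [ compiled pattern, replacement ].
my @rules = (
    [ qr/brown/i        => 'yellow' ],
    [ qr/int\(\d+\)/i   => 'int'    ],
    [ qr/float\(\d+\)/i => 'float'  ],
);

for my $rule (@rules) {
    my ($pattern, $replacement) = @$rule;
    # Rules are applied in the order they are listed.
    $test =~ s/$pattern/$replacement/g;
}
print "$test\n";   # The yellow fox int over float fence.
```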

