PerlMonks  

Is there any way to ignore certain words and keep it as it is when substituing hash values to a matched pattern in a string?

by skooma (Novice)
on Apr 02, 2018 at 09:04 UTC

skooma has asked for the wisdom of the Perl Monks concerning the following question:

So I have a regex in Perl that goes like this:

my $texttosub = "log10(blackcat)"; # the value of "blackcat" can be found in a hash called "%cats"
while ( $texttosub =~ s|([a-zA-Z][A-Za-z_0-9]+)|$cats{$1}|i ) {
    print ("\n", " The value of cat = ", eval ($texttosub) );
    ..do something..
}
sub log10{....}

My question is: how do I ignore "log10" and match only "blackcat" for substitution, so that I can evaluate the "$texttosub" line and print the log10 value of "blackcat"? What I am looking for is this: given blackcat=>5, whitecat=>10, orangecat=>20, the string $texttosub = "log10(blackcat)*whitecat*(log10(orangecat))" must become log10(5)*10*(log10(20)). I tried this:

while ( $texttosub =~ s/(?!log10)|([a-zA-Z][A-Za-z_0-9]+)/$cats{$1}/i ) {
    ..do something..
}

But I am getting an infinite loop for some reason. Thanks in advance.


Replies are listed 'Best First'.
Re: Is there any way to ignore certain words and keep it as it is when substituing hash values to a matched pattern in a string?
by Corion (Patriarch) on Apr 02, 2018 at 09:10 UTC

    You will be much better served by using a proper parser than by hoping to do the parsing through a regular expression.
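    To give an idea of what "a proper parser" could look like here, the following is a minimal recursive-descent sketch, not a recommendation of any particular design. It assumes the expressions only use +, -, *, /, parentheses, single-argument function calls, and names from %cats; the helper names (tokenize, parse_expr, parse_term, parse_factor, expect_close, evaluate) and the %funcs table are made up for illustration, and unary minus is not handled.

```perl
use strict;
use warnings;

# Sample data from the question; %funcs is a hypothetical function table.
my %cats  = ( blackcat => 5, whitecat => 10, orangecat => 20 );
my %funcs = ( log10 => sub { log( $_[0] ) / log(10) } );

# Split the input into numbers, identifiers, and single-character operators.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    while ( $text =~ /\G\s*(\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*\/()])/gc ) {
        push @tokens, $1;
    }
    $text =~ /\G\s*\z/gc
        or die "Unexpected character at offset " . ( pos($text) // 0 ) . "\n";
    return \@tokens;
}

# expr := term (('+' | '-') term)*
sub parse_expr {
    my ($tokens) = @_;
    my $value = parse_term($tokens);
    while ( @$tokens and $tokens->[0] =~ /^[-+]$/ ) {
        my $op  = shift @$tokens;
        my $rhs = parse_term($tokens);
        $value = $op eq '+' ? $value + $rhs : $value - $rhs;
    }
    return $value;
}

# term := factor (('*' | '/') factor)*
sub parse_term {
    my ($tokens) = @_;
    my $value = parse_factor($tokens);
    while ( @$tokens and $tokens->[0] =~ m{^[*/]$} ) {
        my $op  = shift @$tokens;
        my $rhs = parse_factor($tokens);
        $value = $op eq '*' ? $value * $rhs : $value / $rhs;
    }
    return $value;
}

# factor := number | '(' expr ')' | function '(' expr ')' | variable
sub parse_factor {
    my ($tokens) = @_;
    my $tok = shift @$tokens;
    die "Unexpected end of input\n" unless defined $tok;
    return $tok if $tok =~ /^\d/;
    if ( $tok eq '(' ) {
        my $value = parse_expr($tokens);
        expect_close($tokens);
        return $value;
    }
    if ( $tok =~ /^[A-Za-z_]/ ) {
        if ( @$tokens and $tokens->[0] eq '(' ) {    # function call
            my $func = $funcs{$tok} or die "Unknown function '$tok'\n";
            shift @$tokens;                          # consume '('
            my $arg = parse_expr($tokens);
            expect_close($tokens);
            return $func->($arg);
        }
        die "Unknown variable '$tok'\n" unless exists $cats{$tok};
        return $cats{$tok};
    }
    die "Unexpected token '$tok'\n";
}

# Consume a closing parenthesis or fail.
sub expect_close {
    my ($tokens) = @_;
    my $tok = shift @$tokens;
    die "Expected ')'\n" unless defined $tok and $tok eq ')';
}

# Parse and evaluate a whole expression; leftover tokens are an error.
sub evaluate {
    my ($text) = @_;
    my $tokens = tokenize($text);
    my $value  = parse_expr($tokens);
    die "Trailing input: @$tokens\n" if @$tokens;
    return $value;
}

print evaluate("log10(blackcat)*whitecat*(log10(orangecat))"), "\n";
```

    Because names are resolved while parsing, function names and variable names never get confused, and no string eval is needed.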

    That said, it seems to me that you only want to replace variable names and not function names, so, barring any special/convenient notation for your variable names, I can only suggest that you build a proper regular expression out of your variable names:

    my $variables = join "|", map { "\\b$_\\b" } reverse sort keys %cats;
    $variables = qr/$variables/;
    $texttosub =~ s|($variables)|$cats{ $1 }|g;
    # now there are no more variables, so we can evaluate the thing

    You have two logic errors in your code:

    While your string still contains variable names, it makes no sense to try to evaluate it.

    Your code assumes that variable names should be matched independent of case, but your replacement can only replace the case that was stored in %cats. Your code will never replace blackCat with the value for blackcat.
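    The first point can be sketched as two separate steps: substitute everything, then evaluate once. This is only an illustration under assumptions: the body of log10 is a guess at what the question elides, and string eval should only ever see trusted input.

```perl
use strict;
use warnings;

# Values from the original question.
my %cats = ( blackcat => 5, whitecat => 10, orangecat => 20 );

# Hypothetical implementation of the log10 helper that the question elides.
sub log10 { return log( $_[0] ) / log(10) }

my $texttosub = "log10(blackcat)*whitecat*(log10(orangecat))";

# Step 1: replace every known variable name; /g does this in one pass.
my $names = join '|', map { quotemeta } sort { length $b <=> length $a } keys %cats;
( my $expr = $texttosub ) =~ s/\b($names)\b/$cats{$1}/g;

# Step 2: only now evaluate the finished expression, exactly once.
# String eval runs arbitrary code, so this is only safe for trusted input.
my $value = eval $expr;
die "Could not evaluate '$expr': $@" if $@;

print "$expr = $value\n";
```

    Since the pattern is built from the lowercase keys and matched without /i, a name such as blackCat is simply left alone instead of being replaced with an undefined value.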

      join "|", map { "\\b$_\\b" } reverse sort keys %cats;

      I'd suggest putting a quotemeta in there, just to play it safe. Although I agree that using a proper parser is better! skooma: See also Building Regex Alternations Dynamically.

      use warnings;
      use strict;
      use Test::More;

      my %cats = (blackcat=>5, whitecat=>10,orangecat=>20);

      my ($regex) = map { qr/\b($_)\b/ }
          join '|', map {quotemeta}
          sort { length $b <=> length $a or $a cmp $b } keys %cats;
      diag explain $regex;

      sub do_replace {
          my $input = shift;
          $input =~ s/$regex/$cats{$1}/g;
          return $input;
      }

      is do_replace("log10(blackcat)"), "log10(5)";
      is do_replace("log10(blackcat)*whitecat*(log10(orangecat))"), "log10(5)*10*(log10(20))";
      done_testing;

      __END__

      # qr/\b(orangecat|blackcat|whitecat)\b/
      ok 1
      ok 2
      1..2
Re: Is there any way to ignore certain words and keep it as it is when substituing hash values to a matched pattern in a string?
by tybalt89 (Monsignor) on Apr 02, 2018 at 13:15 UTC
    #!/usr/bin/perl

    # http://perlmonks.org/?node_id=1212142

    use strict;
    use warnings;

    my %cats = ( blackcat=>5, whitecat=>10,orangecat=>20 );

    while(<DATA>)
      {
      print;
      s#([a-zA-Z][A-Za-z_0-9]+)# $cats{$1} // $1 #ge;
      print;
      }

    __DATA__
    log10(blackcat)*whitecat*(log10(orangecat))

    Outputs:

    log10(blackcat)*whitecat*(log10(orangecat))
    log10(5)*10*(log10(20))
Re: Is there any way to ignore certain words when substituing?
by haukex (Archbishop) on Apr 02, 2018 at 10:42 UTC
    $texttosub =~ s/(?!log10)|([a-zA-Z][A-Za-z_0-9]+)/$cats{$1}/i
    But I am getting an infinite loop for some reason.

    You don't need the alternation operator |. Lookaround Assertions like (?!...) are zero-width (tutorial). So the regex / (?!foo) | bar /x means "match a zero-length string as long as the next thing isn't foo, or match bar".
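    A small demo of why this loops forever: once the zero-width branch can match, s/// reports success without changing anything, so the while condition stays true. The sketch below uses the pattern from the question but caps the loop at five rounds (an arbitrary limit chosen for this demo) so that it terminates.

```perl
use strict;
use warnings;

my %cats       = ( blackcat => 5 );
my $texttosub  = "log10(blackcat)";
my $iterations = 0;

# Same pattern as in the question, capped at 5 rounds for the demo.
while ( $iterations < 5 ) {
    no warnings 'uninitialized';    # $1 and $cats{$1} are often undef here
    last unless $texttosub =~ s/(?!log10)|([a-zA-Z][A-Za-z_0-9]+)/$cats{$1}/i;
    $iterations++;
    print "round $iterations: '$texttosub'\n";
}
```

    The first round matches log10 through the second branch and replaces it with nothing, because there is no such key in %cats. Every later round matches the empty string in front of the opening parenthesis, so the string never changes and blackcat is never reached.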

    However, even with the alternation operator removed, the regex still won't do what you expect: the pattern basically means "any letters or numbers, as long as the next thing isn't log10", so the result will be "l(5)", because the substring og10 matches the pattern and there is no key og10 in %cats!
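    This can be reproduced with a short script. It keeps the loop from the question and only drops the alternation; the "// ''" fallback is an addition that avoids an undef warning for the missing og10 key.

```perl
use strict;
use warnings;

my %cats      = ( blackcat => 5 );
my $texttosub = "log10(blackcat)";

# Alternation removed, loop kept as in the question.
# The fallback to '' stands in for keys that are missing from %cats.
while ( $texttosub =~ s{(?!log10)([a-zA-Z][A-Za-z_0-9]+)}{ $cats{$1} // '' }ie ) {
    print "now: '$texttosub'\n";
}
```

    The lookahead only rejects a match that starts exactly at the "l" of log10, so the regex engine moves one character to the right and matches og10 instead.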

    Although I've already said that I agree with Corion that using a proper parser is better, and I've shown a different solution, just for the sake of completeness and TIMTOWTDI, here are two additional ways to do what you want. First, you can use the word boundary \b to make sure that the string being matched isn't just a portion of a longer identifier: s/(?!log10)(\b[a-zA-Z][A-Za-z_0-9]+)/$cats{$1}/ works on the sample input you've shown.
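    As a runnable version of this first variant, here is a sketch on the longer sample string from the question. The /g modifier is an addition so that every variable is handled in one pass.

```perl
use strict;
use warnings;

my %cats      = ( blackcat => 5, whitecat => 10, orangecat => 20 );
my $texttosub = "log10(blackcat)*whitecat*(log10(orangecat))";

# \b forces a match to start at the beginning of an identifier, so the
# lookahead can no longer be sidestepped by starting at "og10".
# /g (an addition to the one-liner above) handles every variable in one pass.
$texttosub =~ s/(?!log10)(\b[a-zA-Z][A-Za-z_0-9]+)/$cats{$1}/g;

print "$texttosub\n";
```

    Note that (?!log10) also rejects any identifier that merely begins with log10; writing (?!log10\b) would probably be stricter if such names can occur.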

    Second, you could look at all identifiers, and then figure out what they are once you've matched them. Here, I'm taking advantage of the /e modifier to execute the replacement part as Perl code:

    use warnings;
    use strict;

    my %cats = (blackcat=>5, whitecat=>10,orangecat=>20);
    my %funcs = map {$_=>1} qw/ log10 sin cos /; # etc.

    my $texttosub = "log10(blackcat)*whitecat*(log10(orangecat))";

    $texttosub =~ s{(\b[a-zA-Z][a-zA-Z0-9_]+\b)}{
        my $repl;
        if (exists $cats{$1}) { $repl = $cats{$1} } # exists: a value of 0 still counts
        elsif ($funcs{$1}) { $repl = $1 }
        else { die "Unknown identifier '$1'" }
        print "Matched '$1', replacement '$repl'\n"; # Debug
        $repl
    }eg;

    print $texttosub, "\n";

    __END__

    Matched 'log10', replacement 'log10'
    Matched 'blackcat', replacement '5'
    Matched 'whitecat', replacement '10'
    Matched 'log10', replacement 'log10'
    Matched 'orangecat', replacement '20'
    log10(5)*10*(log10(20))
Re: Is there any way to ignore certain words when substituing?
by haukex (Archbishop) on Apr 02, 2018 at 11:14 UTC

      My bad. I thought the sites were unrelated.

      Edit: Thanks let me check it out.

        I thought the sites were unrelated.

        They are unrelated — except they're both on the Interwebs! Some folks attend one or the other or both (or neither, of course). Also, please see How do I change/delete my post? for site etiquette and protocol regarding changing your post.


        Give a man a fish:  <%-{-{-{-<

Re: Is there any way to ignore certain words and keep it as it is when substituing hash values to a matched pattern in a string?
by mr_ron (Chaplain) on Apr 03, 2018 at 00:57 UTC

    Borrowing ideas from several other postings, including the OP, tybalt89's example, haukex's use of \b, and a hopefully plausible approach to Corion's concern about case sensitivity, perhaps the following is simple but still adequate. Lowercasing the "cat" may or may not be exactly the right approach to being case insensitive.

    use strict;
    use warnings;

    my %cats = ( blackcat=>5, whitecat=>10,orangecat=>20 );
    my $texttosub = 'log10(blackcat)*whitecat*(log10(orangeCat))';
    $texttosub =~ s/\b(?!log10)([a-zA-Z][A-Za-z_0-9]+)\b/$cats{lc $1}/ige;
    print "$texttosub\n";

    Output

    log10(5)*10*(log10(20))
    Ron
