http://qs321.pair.com?node_id=369067


in reply to Re^4: non-exact regexp matches
in thread non-exact regexp matches

You are talking about regexes, but your example shows the most trivial regex one can imagine, namely one that doesn't contain any special characters. Do you want to match any possible regex, or are you just looking for matching strings? The latter is far, far easier than the former - and the latter doesn't need the regex engine at all.
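
As a rough illustration of the plain-string case, here is a minimal sketch that slides the pattern along the text and counts differing characters, with no regex involved. The helper name fuzzy_positions and the sample strings are made up for this example, and it only handles substitutions, not insertions or deletions.

```perl
use strict;
use warnings;

# Hypothetical helper: return the start offsets in $text where the
# literal string $pattern occurs with at most $max_mismatch
# substituted characters.
sub fuzzy_positions {
    my ($text, $pattern, $max_mismatch) = @_;
    my $len = length $pattern;
    my @hits;
    for my $pos (0 .. length($text) - $len) {
        my $window = substr $text, $pos, $len;
        # String XOR yields "\0" wherever the two strings agree, so
        # counting the non-NUL characters gives the mismatch count.
        my $mismatches = (($window ^ $pattern) =~ tr/\0//c);
        push @hits, $pos if $mismatches <= $max_mismatch;
    }
    return @hits;
}

my @found = fuzzy_positions('GGACCTACGG', 'ACCAAC', 2);
print "@found\n";
```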

Abigail

Re^2: non-exact regexp matches
by vinforget (Beadle) on Jun 23, 2004 at 15:32 UTC
    Optimally, I want to match any regexp, but I am not sure if regexps in perl can handle this in a stable fashion. I've been using regexp to report all nested pattern matches with positions of matches using $-[0]:
    m/(regexp)(?{ print $-[0] })(?!)/;

    but everywhere I look most people say to stay away from this stuff because:
    1) it is not stable
    2) it may not be supported in newer versions of perl.
    So I'm not sure if I should take a more specific yet stable approach, or a generalisable yet potentially unstable approach.
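
For comparison, one way to report match positions without (?{ }) is a capturing lookahead under /g. This is only a sketch: overlapping_matches is a hypothetical helper name, and unlike the (?!) trick it reports one match per start position (whichever the engine prefers there), not every possible match length.

```perl
use strict;
use warnings;

# Hypothetical helper: collect [start, matched text] for every start
# position at which $re matches, including overlapping matches.
sub overlapping_matches {
    my ($text, $re) = @_;
    my @hits;
    # The lookahead consumes nothing, so the engine retries at each
    # following position; $-[1] is the start of the captured group.
    while ($text =~ /(?=($re))/g) {
        push @hits, [ $-[1], $1 ];
    }
    return @hits;
}

my @hits = overlapping_matches('AAAA', qr/AA/);
print "$_->[0] $_->[1]\n" for @hits;
```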
      I don't see the connection between using the (?{ }) and (?!) to report all matches and your original question of finding "partial" matches.

      So you want any regexp to match fuzzily. But then your example is unclear: it picks out positions in the regex (not in the string) to indicate where characters should be changed. Do you also want to be able to change special characters? Is it OK to introduce characters in the regex to make it match? (That would be easy: just add a | as the first character in the regex.)

      Abigail

        I refined my question a little more. I have a string of letters [ACGTacgtNn] from which I want to find a particular instance of a regexp, let's say:
        /ACCAAC[ACGTacgtNn]{6}CTA[ACGTacgtNn]{1}ATG[ACGTacgtNn]{1,2}GATGTT/

        I can do this just fine, but what if I want to match the above regexp with a tolerance of 2 mismatches for single characters? Below I have an example:
        $buf =~ m/(A)(C)(C)(A)(A)(C)([ACGTacgtNn]{6})(CTA[ACGTacgtNn]{1})(A)(T)(G)([ACGTacgtNn]{1,2})(G)(A)(T)(G)(T)(T)(?{ print $-[0]," ",scalar@-,"\n"; })(?!)/;
        This will print the position of the match in $buf, followed by 19 (the number of submatches). I want to be able to return a match with anywhere from 17 to 19 submatches, not just all 19. Thanks. Vince
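
One possible way to get that tolerance without the regex engine is sketched below, under some assumptions. The motif is written as fixed-length templates in which '.' stands for one [ACGTacgtNn] wildcard position, the {1,2} gap is expanded by hand into two templates, and only substitutions at the literal positions are counted as mismatches. The helper tolerant_matches and the sample $buf are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: '.' in a template is a wildcard position; every
# other character must match literally or it counts as a mismatch.
# Returns [start, length, mismatches] for each acceptable window.
sub tolerant_matches {
    my ($text, $max_mismatch, @templates) = @_;
    my @hits;
    for my $template (@templates) {
        my @want = split //, $template;
        my $len  = @want;
        for my $pos (0 .. length($text) - $len) {
            my $mismatches = 0;
            for my $i (0 .. $#want) {
                next if $want[$i] eq '.';
                $mismatches++ if substr($text, $pos + $i, 1) ne $want[$i];
                last if $mismatches > $max_mismatch;   # give up early
            }
            push @hits, [ $pos, $len, $mismatches ]
                if $mismatches <= $max_mismatch;
        }
    }
    @hits = sort { $a->[0] <=> $b->[0] } @hits;
    return @hits;
}

# The motif from the post, with the {1,2} gap expanded into two templates.
my @templates = (
    'ACCAAC......CTA.ATG.GATGTT',
    'ACCAAC......CTA.ATG..GATGTT',
);

# Made-up sample: an exact occurrence of the motif starting at offset 2.
my $buf  = 'TT' . 'ACCAAC' . 'GGGGGG' . 'CTA' . 'C' . 'ATG' . 'A' . 'GATGTT' . 'AA';
my @hits = tolerant_matches($buf, 2, @templates);
printf "%d %d %d\n", @$_ for @hits;
```

Each hit carries its mismatch count, so asking for at most 2 mismatches corresponds roughly to accepting 17 to 19 of the submatches in the capturing version.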