PerlMonks  

Re^2: non-exact regexp matches

by vinforget (Beadle)
on Jun 23, 2004 at 15:32 UTC ( [id://369075]=note )


in reply to Re: non-exact regexp matches
in thread non-exact regexp matches

Ideally, I want to match any regexp, but I am not sure whether regexps in Perl can handle this in a stable fashion. I've been using a regexp to report all nested pattern matches, with the position of each match taken from $-[0]:
m/(regexp)(?{ print $-[0] })(?!)/;

but everywhere I look, most people say to stay away from this stuff because:
1) it is not stable
2) it may not be supported in newer versions of Perl.
So I'm not sure whether I should take a more specific yet stable approach, or a generalisable yet potentially unstable approach.
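For readers weighing the two approaches, here is a minimal sketch of the idiom from the post next to a variant that avoids embedded code. The sample string, the literal pattern ACA, and the array names are made up for illustration. The second loop swaps in a different technique (a zero-width lookahead under /g), which is not what the post uses.

```perl
use strict;
use warnings;

my $buf = "ACACACA";    # illustrative input, not from the thread

# Idiom from the post: the code block runs each time the pattern body
# matches, then (?!) forces a failure so the engine tries the next spot.
my @starts;
$buf =~ m/(ACA)(?{ push @starts, $-[0] })(?!)/;

# Swapped-in alternative: a lookahead consumes nothing, so /g can
# report overlapping matches without any embedded code.
my @starts_lookahead;
while ($buf =~ m/(?=(ACA))/g) {
    push @starts_lookahead, $-[0];
}

print "@starts\n";
print "@starts_lookahead\n";
```

Both loops should report the same start offsets for a pattern like this one. The lookahead form only reports one match per start position, so it is not a drop-in replacement when several matches of different lengths can begin at the same offset.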

Replies are listed 'Best First'.
Re: non-exact regexp matches
by Abigail-II (Bishop) on Jun 23, 2004 at 15:52 UTC
    I don't see the connection between using the (?{ }) and (?!) to report all matches and your original question of finding "partial" matches.

    So you want any regexp to match fuzzily. But then your example is unclear: it picks out positions in the regex (not in the string) to indicate where characters should be changed. Do you also want to be able to change special characters? Is it OK to introduce characters into the regex to make it match? (That would be easy: just add a | as the first character in the regex.)

    Abigail

      I refined my question a little more. I have a string of letters [ACGTacgtNn] from which I want to find a particular instance of a regexp, let's say:
      /ACCAAC[ACGTacgtNn]{6}CTA[ACGTacgtNn]{1}ATG[ACGTacgtNn]{1,2}GATGTT/

      I can do this just fine, but what if I want to match the above regexp with a tolerance of 2 mismatches for single characters? Below I have an example:
      $buf =~ m/(A)(C)(C)(A)(A)(C)([ACGTacgtNn]{6})(CTA[ACGTacgtNn]{1})(A)(T)(G)([ACGTacgtNn]{1,2})(G)(A)(T)(G)(T)(T)(?{ print $-[0]," ",scalar@-,"\n"; })(?!)/;
      This will print the position of the match in $buf, followed by 19 (the number of submatches). I want to be able to return a match with 17-19 submatches, not just all 19. Thanks. Vince
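One way to picture "a tolerance of 2 mismatches" without the regex engine is a plain sliding-window comparison. This is a sketch under assumptions, not the approach used in the thread: the helpers count_mismatches and fuzzy_positions are hypothetical, '.' in a template stands for the [ACGTacgtNn] positions, the {1,2} part is expanded by hand into two fixed-length templates, and the test string is invented. It only counts substituted characters, not insertions or deletions.

```perl
use strict;
use warnings;

# Hypothetical helper: number of literal template positions at which
# $window differs; '.' in the template accepts any character.
sub count_mismatches {
    my ($window, $template) = @_;
    my $mismatches = 0;
    for my $i (0 .. length($template) - 1) {
        my $want = substr($template, $i, 1);
        next if $want eq '.';
        $mismatches++ if substr($window, $i, 1) ne $want;
    }
    return $mismatches;
}

# Hypothetical helper: returns [position, mismatches, template] for
# every window of $buf that is within $max mismatches of a template.
sub fuzzy_positions {
    my ($buf, $max, @templates) = @_;
    my @hits;
    for my $template (@templates) {
        my $len = length $template;
        for my $pos (0 .. length($buf) - $len) {
            my $n = count_mismatches(substr($buf, $pos, $len), $template);
            push @hits, [$pos, $n, $template] if $n <= $max;
        }
    }
    return @hits;
}

# The pattern from the post, with {1,2} written out as two templates.
my @templates = (
    'ACCAAC......CTA.ATG.GATGTT',
    'ACCAAC......CTA.ATG..GATGTT',
);

# Invented input: an exact hit with its first and last letters altered.
my $buf  = 'GG' . 'TCCAACGGGGGGCTATATGCGATGTA';
my @hits = fuzzy_positions($buf, 2, @templates);

print "pos $_->[0]: $_->[1] mismatch(es)\n" for @hits;
```

The cost is one pass over the string per template, which stays manageable as long as the variable-length parts expand into only a few templates.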
        Will this do?
        use re 'eval';
        no strict 'refs';
        if (/(A)?(C)?(C)?(A)?(A)?(C)?([ACGTacgtNn]{6})?(CTA[ACGTacgtNn]{1})?
             (A)?(T)?(G)?([ACGTacgtNn]{1,2})?(G)?(A)?(T)?(G)?(T)?(T)?
             (?(?{17 <= grep {defined $$_} 1 .. 19})|(?!))/x) { ... }

        Abigail
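To see the shape of that reply on something small enough to check by hand, here is a scaled-down sketch of the same construct: every group is optional, and a conditional with a code condition rejects the match unless enough groups actually captured something. The four-letter pattern, the threshold of 3, the anchors, and the name tolerant_match are all assumptions made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $str is ACGT with at most one of the
# four letters left out.
sub tolerant_match {
    my ($str) = @_;
    no strict 'refs';    # $$_ below looks up $1 .. $4 by name
    return $str =~ /\A (A)? (C)? (G)? (T)?
                     (?(?{ 3 <= grep { defined $$_ } 1 .. 4 }) | (?!) )
                     \z/x ? 1 : 0;
}

for my $candidate (qw(ACGT ACT AT)) {
    printf "%-4s %s\n", $candidate,
        tolerant_match($candidate) ? "accepted" : "rejected";
}
```

Because a skipped group matches the empty string, this form tolerates a letter that is missing from the input rather than one that has been replaced by a different letter.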
