PerlMonks  
This comes close, but it fails when an otherwise valid match is immediately followed by further repetitions of its last character.
/(?:(?!(.)\1\1)[QGYN]){3,6}/;
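To make that failure concrete, here is a minimal sketch. It assumes the goal is a run of 3 to 6 residues from [QGYN] in which no residue appears three times in a row; the test strings and the helper name first_match are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the post, unchanged.
my $re = qr/(?:(?!(.)\1\1)[QGYN]){3,6}/;

# Hypothetical helper: return the text of the first match, or undef.
sub first_match {
    my ($seq) = @_;
    return $seq =~ $re ? substr($seq, $-[0], $+[0] - $-[0]) : undef;
}

# Hypothetical inputs, chosen only to illustrate the behaviour.
for my $seq (qw(QGYYN QGYYY)) {
    my $m = first_match($seq);
    print "$seq: ", defined $m ? "matched '$m'" : "no match", "\n";
}
```

QGYYN matches in full. QGYYY does not match at all, even though QGYY would be acceptable: the lookahead rejects the first Y of the trio outright, and the two Ys left at the end are too short to satisfy {3,6}.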
You might have to consider each character separately, which leads to a long, ugly string of alternations. The first char matches your character class. The second is either not a repeat of the first, or is a repeat that is not itself followed by another repeat. The third is either not a repeat of the second, or is a repeat that is not followed by another repeat.

After that, the pattern is repeated for the 4th and 5th characters, but they're all optional and nested (so if you don't have the 4th char, you don't look for the 5th). The 6th char doesn't need to check for repetitions, because it was checked by the pattern for the 5th char.

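while ($seq{$k} =~ /(
    ([QGYN])                       # 1st
    ((?!\2)[QGYN]|\2(?!\2))        # 2nd
    ((?!\3)[QGYN]|\3(?!\3))        # 3rd
    (?:((?!\4)[QGYN]|\4(?!\4))     # 4th, optional
      (?:((?!\5)[QGYN]|\5(?!\5))   # 5th, optional
        [QGYN]?)?)?                # 6th, optional
  )/xg) {
    print "\n$k";
    # $s was never assigned in this snippet; the match itself is in $1
    print $1, " begins at position ", pos($seq{$k}) - length($1), "\n";
}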
Update: adjusted to fit OP's code snippet.
Update2: As Ikegami noted (and I noted in responding to a different post), this solution has the problem of looking too far ahead. It won't take the first two characters out of a trio. A working regex-only solution is posted as a reply to this post.
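A small sketch of the look-ahead problem described in Update2, using the pattern above as posted. The inputs and the helper name first_match are hypothetical, and the "ideal" results assume the goal is 3 to 6 residues with no residue three times in a row.

```perl
use strict;
use warnings;

# The six-character pattern from the post, laid out with /x.
my $re = qr/(
    ([QGYN])
    ((?!\2)[QGYN]|\2(?!\2))
    ((?!\3)[QGYN]|\3(?!\3))
    (?:((?!\4)[QGYN]|\4(?!\4))
      (?:((?!\5)[QGYN]|\5(?!\5))
        [QGYN]?)?)?
)/x;

# Hypothetical helper: return the first match in $seq, or undef.
sub first_match {
    my ($seq) = @_;
    return $seq =~ $re ? $1 : undef;
}

# Hypothetical inputs, chosen only to illustrate the behaviour.
for my $seq (qw(QYYG QGYYY QYYY)) {
    my $m = first_match($seq);
    print "$seq: ", defined $m ? "matched '$m'" : "no match", "\n";
}
```

QYYG matches in full. QGYYY yields only QGY, where QGYY would be the ideal, and QYYY yields no match, where QYY would be. In both cases the test "a repeat followed by not a repeat" peeks at the third Y and so refuses to accept the second.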

Caution: Contents may have been coded under pressure.

In reply to Re: Perl regular expression for amino acid sequence by Roy Johnson
in thread Perl regular expression for amino acid sequence by seaver
