Re: Seeking algorithm for finding common continous sub-patterns

by mattr (Curate)
on Dec 04, 2004 at 18:12 UTC


in reply to Seeking algorithm for finding common continous sub-patterns

I cannot add much to the interesting responses above right now, but I would like to note that your problem resembles the problem of finding matching sequences in genomes.

I also found a nice paper of academic interest on sequence searching. It is not exactly your problem, but I couldn't resist because it has a nice hashing of multidimensional tables at around section 4.3. It involves sliding a window over the target and storing the slopes of segments in very big hashes (if I understood what I read). I'd like to play with this more, but it's 3am here.
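To make the sliding-window idea a little more concrete, here is a rough sketch in Python. It is only my loose reading of the approach, not the paper's actual method: it assumes a numeric sequence and a fixed window width, takes the differences between neighbouring values as the "slopes", and uses them as hash keys. The names `index_windows` and `repeated_windows` are made up for illustration.

```python
from collections import defaultdict


def index_windows(seq, width):
    """Map each window's slope signature to the start positions where it occurs."""
    table = defaultdict(list)
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Slopes between neighbouring points. Two windows that differ only by
        # a constant offset produce the same key.
        slopes = tuple(b - a for a, b in zip(window, window[1:]))
        table[slopes].append(start)
    return table


def repeated_windows(seq, width):
    """Keep only the slope signatures seen at more than one position."""
    return {key: pos for key, pos in index_windows(seq, width).items() if len(pos) > 1}
```

Calling `repeated_windows(data, 3)` on a list of numbers returns the slope patterns that occur more than once, together with where they start. The table grows with the number of distinct windows, which is presumably why the hashes get very big.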

If you do not need an exhaustive list of all patterns, but only the most interesting ones, or the statistically significant ones while allowing for a number of sequence errors, code from biology may be more useful to you. For example, if you are curious, you could look at how RepeatFinder at TIGR does it.

You could also google "hidden Markov", "interpolated Markov", or "Viterbi"; these techniques are often used to find hidden sequences or to predict what will come next in a sequence.
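For a feel of what Viterbi does, here is a minimal textbook-style sketch in Python. It assumes you already have a hidden Markov model as plain dictionaries of start, transition, and emission probabilities (the parameter names are my own), and it returns the single most likely sequence of hidden states for a list of observations. It multiplies raw probabilities, so for long sequences you would want log probabilities instead.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for the observations."""
    # best[t][s] = (probability of the best path ending in state s at step t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor state that maximises the path probability.
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]
```

To use it, you supply the list of state names and the three probability tables; how you estimate those tables from your own data is a separate problem that this sketch does not cover.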
