PerlMonks  

Re: n-dimensional statistical analysis of DNA sequences (or text, or ...)

by Laurent_R (Canon)
on Jan 22, 2019 at 18:18 UTC


in reply to n-dimensional statistical analysis of DNA sequences (or text, or ...)

Hi bliako,

You might be interested in Section 11.13 (Markov Analysis) of my Perl 6 book (http://greenteapress.com/thinkperl6/thinkperl6.pdf), which suggests an exercise on a quite similar subject, and in Subsection A.9.5, which presents a solution to that exercise. The exercise and its solution do much simpler things than your module, but they go in the same direction: for a given sequence of words in a text, work out the probability of each word that might come next, then use those probabilities to generate random sentences that almost look like English (at least much more so than just picking random words). For example, running the program on Emma, the novel by Jane Austen, produced the following random text:

it was a black morning’s work for her. the friends from whom she could not have come to hartfield any more! dear affectionate creature! you banished to abbey mill farm. now i am afraid you are a great deal happier if she had no hesitation in approving. dear harriet, i give myself joy of so sorrowful an event;
As you can see, the result is almost syntactically correct, but not quite. And, semantically, it almost makes sense, but not quite.
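
For readers who want to see the idea in a few lines, here is a minimal sketch of that kind of Markov analysis. It is written in Python for illustration only and is not the Perl 6 solution from the book; the names `build_table` and `generate`, the toy sentence, and the prefix length of two words are all assumptions made for this example.

```python
import random
from collections import defaultdict


def build_table(words, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    table = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        # Duplicates are kept on purpose: a successor seen twice
        # is twice as likely to be drawn later.
        table[prefix].append(words[i + order])
    return dict(table)


def generate(table, length=30, seed=None):
    """Walk the table from a random prefix, drawing each next word at random."""
    rng = random.Random(seed)
    prefix = rng.choice(sorted(table))
    output = list(prefix)
    while len(output) < length:
        successors = table.get(prefix)
        if not successors:  # this prefix only occurred at the very end of the text
            break
        word = rng.choice(successors)
        output.append(word)
        prefix = prefix[1:] + (word,)
    return " ".join(output)


# Toy input (hypothetical); a real run would read a whole novel instead.
text = "the cat sat on the mat and the cat ate the fish and the cat sat down"
table = build_table(text.split(), order=2)
sample = generate(table, length=12, seed=1)
print(sample)
```

Every three-word window of the output occurs somewhere in the input, which is why the generated text looks locally plausible while drifting globally. A longer prefix should make the output more coherent, but it also makes it copy longer stretches of the source.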

I hope you find it fun.

Re^2: n-dimensional statistical analysis of DNA sequences (or text, or ...)
by bliako (Monsignor) on Jan 22, 2019 at 18:52 UTC

    Thanks for the book, Laurent_R, and for letting me know about your case study. I may use your book for my first contact with Perl 6.

    As you say, selecting the data structure is challenging. The algorithm is there, but a different data structure will make it more convenient to use the algorithm in different tasks.

    I am interested in Statistical Learning and in this algorithm (and Monte Carlo), which is so simple. And Perl gives us so much freedom to try things and whip up code in no time.

    I give myself joy...
