Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
I am trying to write a program that implements the Markov algorithm ... (two words) must appear in the sentence at some random fixed point.

Depending on what you mean by "random fixed point", you may have moved out-of-bounds with respect to the Markov algorithm.

For the benefit of those following along, the traditional Markov random-text generator samples text to (conceptually) build a big sparse matrix. Each cell in the matrix represents the probability of transitioning from one state to the next, where "state" is a tuple of the N adjacent tokens. For example, by sampling

the quick red fox jumped over the lazy perl hacker
with N=1, the matrix will show that the probability of transitioning from (the) to (quick) is 50%. By sampling some larger text with N=2, the matrix might show that the probability of transitioning from (Perl is) to (is neat) to be 90%, and to (is hard) to be 10%.

The transition matrix is easier to implement using a tree, but conceptually it's just a sparse matrix.

To generate random text, we just roll the dice and choose a legal transition.

Your problem complicates this considerably. By saying that you want two words A and B to appear in the sentence at random points, you require an algorithm that finds a legal set of transitions that first gets you to A (or B), and then eventually gets you to B (or A), and then eventually terminates with the end of a sentence. You could do this with a tree search, but there's no guarantee that it will terminate unless a valid sequence of paths existed in your source text. And depending on how you record sentence start and sentence end, there might not be.

You might try this. Start with A, run the algorithm backwards until you find a valid sentence start. To do this (assuming you're using a tree), you'll need to build a second tree to do backwards transitions. Then run the algorithm forward building a tree breadth-first (so that you can terminate at some arbitrary depth) looking for B. If you find it, run the algorithm forward until you find the end of then sentence (again using a breadth-first search).

And I recommend that you don't try to hold your breath while the algorithm runs. :)


In reply to Re: help with a new type of Markov by dws
in thread help with a new type of Markov by thealienz1

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (4)
As of 2024-04-24 19:30 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found