Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
a probability of substituting any given position in the string with A C T G
Look at it as not only a project related to Perl since this posts an intersection of math and statistics (error rate, transition rates, transversion rates...etc), biology (simulating a molecular process of replication) and programming (implementing the computational algorithm using a programming language)

So, in each clone, what's the error (mutation) limit?, does the mutation position hold any significance? do you need to filter your genomes afterwards to select only those still-functional ones that were not corrupted by the mutations but where the mutation may present phenotypic differences?

My initial thought was to make a mutlidimensional array, where each successive round of replication would be represented as values in the new array dimension
If I understand you correctly, this type of array can be an arbitrarily-depth depending on the recursion of replication rounds, in Perl, it is not recommended you add or remove to an array while performing operations on it so you may have to initiate an array a-priori as indicated by the number of replications to be performed then populate that.

In light of the above, hidden markov models can use the transition probabilities while performing the mutations (maybe difficult when wanting to consider all the different cases out there), your best bet is to grab a statistician or to keep one handy. HMMs have been extensively used before in sequence alignment and detecting indels hence this scenario here is related.

A simpler approach is to use randomization to simulate mutations, check this discussion and its implementation or to consult the feasibility of using an already available solution.

UPDATE:Scanning the simulation program in the links I gave you; it seems to have a loose criteria on what base can be replaced with what so I followed another approach by introducing conditions on mutation types (favoring transitions rather than fatal transversions) and giving you control over the number of replication iterations then showing you the sequences aligned in proximity for good visuals

# a program that generates a 50 bp random DNA seq #and introduces 1 mutation per DNA replication use strict; use warnings; use Data::Dumper; #generating a 50 bp seq one base at a time. #you can read sets of seqs from a file instead my ($base,$virusGenome); my @nucleotide = ('a','t','g','c'); for(1..50){ $base = $nucleotide[rand(3)]; $virusGenome .=$base; } my @seqArray; #a hash table of the permissible mutations #no freqs rates, note the base case. my %mutationBounds=('a'=>'T','c'=>'G','g'=>'C','t'=>'A'); print "how many replications?"; chomp(my $rounds=<>); mutate(times=>$rounds, seq=>$virusGenome); sub mutate{ my %params=@_; print "original seq=> $params{seq}\n"; for my $i (0..$params{times}-1){ #convert seq from string to array @seqArray = split("",$params{seq}); #pick a base from a random position my $randPos = int(rand(length($params{seq}))); my $toReplace=$seqArray[$randPos]; #mutate based on %mutationBounds # assign to a seq the altered $virusGenome. my $seqMute = $params{seq}; substr ($seqMute, $randPos,1,$mutationBounds{$toReplace}); + print "mutant seq",$i+1,"=> ", $seqMute,"\n"; #which rep cycle is done. What was replaced where. #print "Cycle " ,$i+1, " $toReplace replaced w $mutationBounds +{$toReplace} @ $randPos\n"; } }; #print Dumper(\%mutationBounds); #print Dumper(\%params); #print Dumper(\@seqArray); ##OUTPUT hisham@bioslayer:~/Desktop$ perl how many replications?3 original seq=> gaataggaataatagggagattgtaagagagtgggttagtatagttaata mutant seq1=> gaataggaataatagggagattgtaagagagtgggttagtatagttaaAa mutant seq2=> gaTtaggaataatagggagattgtaagagagtgggttagtatagttaata mutant seq3=> gaataggaataataCggagattgtaagagagtgggttagtatagttaata

David R. Gergen said "We know that second terms have historically been marred by hubris and by scandal." and I am a two y.o. monk today :D, June,12th, 2011...

In reply to Re: Virus toy model by biohisham
in thread Virus toy model^2 by ZWcarp

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (5)
As of 2022-01-22 22:02 GMT
Find Nodes?
    Voting Booth?
    In 2022, my preferred method to securely store passwords is:

    Results (63 votes). Check out past polls.