PerlMonks
Re^5: Is it possible to make reference to substrings and sort them? by BrowserUk (Patriarch)
on Mar 23, 2015 at 05:40 UTC ( [id://1120937]=note )
"Suffix trie is not efficient in terms of memory, as the trie needs to store all suffixes from 1 to length(text)"

Hm. Maybe I'm misreading the paper, but my reading suggests that constructing a BWT by the classical method -- construct all the rotations and then sort them -- takes quadratic time, which is why the normal practice is now to construct a prefix trie first, and then the BWT from that (to save time).

And further, the problem of suffix tries being inefficient in space ("n log2 n bits of working space, which amounts to 12 GB for human genome") is fixed by Hon et al. (2007), which "only requires <1 GB memory at peak time for constructing the BWT of human genome". The relevant passage:

"The algorithm shown in Figure 2 is quadratic in time and space. However, this is not necessary. In practice, we usually construct the suffix array first and then generate BWT. Most algorithms for constructing suffix array require at least n log2 n bits of working space, which amounts to 12 GB for human genome. Recently, Hon et al. (2007) gave a new algorithm that uses n bits of working space and only requires <1 GB memory at peak time for constructing the BWT of human genome. This algorithm is implemented in BWT-SW (Lam et al., 2008). We adapted its source code to make it work with BWA."

Anyway. How many mismatches do you allow and still consider a site to match one of your 20nt short reads?

I have code that can find all the 1-mismatch sites for each of 10,000 x 20nt short reads in a 5Mnt strand, in 1.5 seconds using 1/4GB memory.
Allowing two mismatches requires 6 1/2 minutes.
And allowing three requires 30 minutes.
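To pin down what "a site within k mismatches of a read" means, here is a minimal brute-force sketch in Python. It is not the code that produced the timings above, which is not shown in this post. The names hamming_capped and find_sites and the toy strand are hypothetical, chosen only for illustration. It slides each read along the strand and counts differing positions, giving up on a window as soon as the limit is exceeded.

```python
def hamming_capped(a, b, limit):
    """Count mismatching positions between two equal-length strings.

    Stops counting as soon as the count exceeds `limit`, so the return
    value is either the true distance (if <= limit) or limit + 1.
    """
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > limit:
                break
    return mismatches


def find_sites(strand, read, max_mismatches):
    """Return (offset, mismatch_count) for every window of `strand`
    that differs from `read` in at most `max_mismatches` positions."""
    width = len(read)
    sites = []
    for offset in range(len(strand) - width + 1):
        window = strand[offset:offset + width]
        d = hamming_capped(window, read, max_mismatches)
        if d <= max_mismatches:
            sites.append((offset, d))
    return sites


if __name__ == "__main__":
    # Toy data only; real inputs would be a multi-Mnt strand and 20nt reads.
    strand = "ACGTACGTTACGAACGT"
    read = "ACGA"
    for k in (0, 1):
        print(k, find_sites(strand, read, k))
```

This scan does work proportional to strand length times read length for every read, so it would be far too slow for 10,000 reads against a 5Mnt strand. It is only a reference for checking the output of a faster, indexed search.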
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked