Re^2: Iteration speed

by jepri (Parson)
on Jun 16, 2004 at 13:05 UTC (#367218)

in reply to Re: Iteration speed
in thread Iteration speed

Oh, there's a few of us around :)

The problem, as noted by others, is that we can't see his code to make suggestions. Shrug. Can't help much there. He doesn't even say if he's using the Perl bioinformatics modules or if he's rolled his own.

In any case, this is a problem that is begging for a parallel processing solution. In general, I'd recommend he break up the dataset and run it on all the machines in the lab. I doubt that there are many algorithmic improvements that can beat adding another 5 CPUs to the task.
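
As a rough illustration of the "break up the dataset" step, here is a minimal sketch in Python. It assumes the records can be processed independently of one another, which the thread does not confirm. The function name `split_into_chunks` and the worker count of 6 (the original machine plus "another 5 CPUs") are made up for the example.

```python
def split_into_chunks(records, n_workers):
    """Split records into n_workers contiguous chunks whose sizes differ by at most one."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # base = minimum chunk size; the first `extra` chunks get one more record.
    base, extra = divmod(len(records), n_workers)
    chunks = []
    start = 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        chunks.append(records[start:start + size])
        start += size
    return chunks


# Hypothetical usage: 10 records shared among 6 workers.
chunks = split_into_chunks(list(range(10)), 6)
for worker_id, chunk in enumerate(chunks):
    print(worker_id, chunk)
```

Each chunk would then be written to a file or sent to one machine, and the per-chunk results merged afterwards. If the task compares records against each other, a plain split like this is not enough, because matches can cross chunk boundaries.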

I didn't believe in evil until I dated it.

Re^3: Iteration speed
by BrowserUk (Patriarch) on Jun 16, 2004 at 13:23 UTC

    I know there are a few of you guys around, but the description left me (and a few others, judging from the responses) completely cold :)

    Belatedly, I have begun to think that this problem is related to a previous thread. If that is the case, I think that an algorithmic approach similar to the one I outlined at Re: Re: Re: Processing data with lot of math... could cut the processing time to a fraction of that of a brute-force iteration. As I mentioned in that post, my crude testing showed that by using an intelligent search to limit the comparisons to a fraction of the possible pairs, I can process 100,000 coordinates and find 19,000 matching pairs in around 4 minutes without trying very hard to optimise.
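
To show what "limiting the comparisons" can look like, here is a minimal sketch in Python. It is not the code from the linked post, which is not reproduced in this thread. It uses a swapped-in technique, uniform grid bucketing, and assumes a "matching pair" means two points within some distance `radius` of each other. The name `close_pairs` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import product


def close_pairs(points, radius):
    """Return sorted index pairs (i, j), i < j, whose Euclidean distance is <= radius.

    points: list of equal-length coordinate tuples; radius: positive number.
    """
    if not points:
        return []
    dim = len(points[0])
    # Bucket every point into a grid cell with side length `radius`.
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        cell = tuple(int(c // radius) for c in p)
        grid[cell].append(idx)

    r2 = radius * radius
    offsets = list(product((-1, 0, 1), repeat=dim))
    pairs = []
    for cell, members in grid.items():
        # Two points within `radius` must lie in the same or an adjacent cell,
        # so only those cells need to be compared.
        for off in offsets:
            neighbours = grid.get(tuple(c + o for c, o in zip(cell, off)))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    if i < j:
                        d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                        if d2 <= r2:
                            pairs.append((i, j))
    return sorted(pairs)


# Hypothetical sample data: four 2D points, search radius 1.0.
sample = [(0.0, 0.0), (0.9, 0.0), (1.1, 0.0), (5.0, 5.0)]
found = close_pairs(sample, 1.0)
print(found)
```

When the points are spread out, each point is compared only with the handful in neighbouring cells instead of with every other point, which is where the saving over a brute-force double loop comes from.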

    I agree that a distributed search would perform the same task more quickly, but the additional complexity of setting up that kind of system is best avoided if it can be. And if this is the same problem, that is easily possible. My test code from the previous thread is under 60 lines including the benchmarking.

    What stops me offering that code here is the lack of 1) a clear description of the problem, and 2) some real test data.

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
