Re: 3D test data that exhibits clustering?
by moritz (Cardinal) on Jun 08, 2011 at 11:32 UTC
|
Or if you have thoughts on how to generate same?
You can generate a few triples of random numbers and use them as cluster centers. Then, for each center, generate a random number of points that lie close to it.
For example, you can use a Gaussian distribution around the centers. For that you need normally distributed random numbers, which you can generate with the Box–Muller transform from the uniformly distributed random numbers that Perl's rand produces.
Depending on the data you want to emulate, you might also want to add a number of totally random, non-clustered points.
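A minimal sketch of the recipe above, written in Python for brevity (random.random plays the role of Perl's rand). The function names, the unit-cube range, the single shared sigma, and the default counts are all assumptions chosen for illustration, not anything specified in this thread.

```python
import math
import random

def box_muller(rng):
    """Turn two uniform samples into one standard-normal sample."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

def clustered_points(n_clusters=5, max_per_cluster=200, sigma=0.05,
                     n_noise=100, seed=None):
    """Return (x, y, z, label) tuples; label -1 marks background noise.

    Centers are uniform in the unit cube (an assumption); each cluster gets
    a random number of points scattered normally around its center.
    """
    rng = random.Random(seed)
    points = []
    for label in range(n_clusters):
        center = [rng.random() for _ in range(3)]
        count = rng.randint(1, max_per_cluster)
        for _ in range(count):
            x, y, z = (c + sigma * box_muller(rng) for c in center)
            points.append((x, y, z, label))
    # Totally random, non-clustered points spread over the whole cube.
    for _ in range(n_noise):
        points.append((rng.random(), rng.random(), rng.random(), -1))
    return points

if __name__ == "__main__":
    for x, y, z, label in clustered_points(seed=42)[:5]:
        print(f"{x:.4f} {y:.4f} {z:.4f} {label}")
```

Varying sigma per cluster, or per axis, would probably give less uniform-looking blobs; a single value is used here only to keep the sketch short.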
| [reply] |
Re: 3D test data that exhibits clustering?
by Eliya (Vicar) on Jun 08, 2011 at 11:41 UTC
|
And every attempt I tried so far to generate "random clusters" hasn't really worked either.
What exactly have you tried and why hasn't it worked?
The approach moritz suggests (or variations thereof) seems to be rather straightforward, and given the brilliance of mind you've often exhibited here, I cannot believe you haven't already thought of it... :) So, what's wrong with it?
| [reply] |
|
What exactly have you tried and why hasn't it worked?
I tried generating random points around a set of random starting points (without moritz' enhancement of a normal distribution around those starting points), but it generates sets like these. (Color-coded by start point.)
As you can see, you tend to get either very concentrated, widely separated groupings or widely spread groups that almost entirely overlap. Neither is representative of the kind of plots you get from real datasets that exhibit clustering.
moritz' enhancement might improve things somewhat--I'm trying it now--but if there were one or a few real datasets kicking around somewhere it would give me more confidence that I was performing a real test.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] |
Re: 3D test data that exhibits clustering?
by salva (Canon) on Jun 08, 2011 at 15:47 UTC
|
Some stellar database?
For instance, a quick Google search reveals NOMAD.
| [reply] |
|
Plenty of star databases include distance estimates. They're based on trigonometric parallax, absolute-brightness phenomena, redshift, etc.
Stellarium wouldn't be so much fun without them.
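If a catalog gives right ascension, declination and a trigonometric parallax, each entry can be turned into a 3D point. The sketch below is in Python and assumes the parallax is in milliarcseconds and the angles are in degrees; the function name and argument names are hypothetical, and real catalogs differ in their column names and units.

```python
import math

def star_to_xyz(ra_deg, dec_deg, parallax_mas):
    """Convert equatorial coordinates plus parallax to Cartesian parsecs.

    Distance in parsecs is 1000 / parallax in milliarcseconds. Entries with
    zero or negative parallax carry no usable distance and are rejected.
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive to derive a distance")
    d = 1000.0 / parallax_mas
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    x = d * math.cos(dec) * math.cos(ra)
    y = d * math.cos(dec) * math.sin(ra)
    z = d * math.sin(dec)
    return (x, y, z)

if __name__ == "__main__":
    # A made-up entry, not taken from any catalog.
    print(star_to_xyz(ra_deg=90.0, dec_deg=0.0, parallax_mas=100.0))
```

Small parallaxes have large relative errors, so distant entries will be badly smeared along the line of sight; filtering on a minimum parallax is one way to limit that.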
| [reply] |
|
I am sure that some databases include the distance, though maybe not with the resolution you need.
| [reply] |
Re: 3D test data that exhibits clustering?
by salva (Canon) on Jun 08, 2011 at 16:00 UTC
|
Some algorithm based on the Artificial Termites?
For instance, generate some random points in a 3D space. Then simulate some termites moving randomly over that space: each termite can pick up a point when it comes near enough to it, and drop it again when it comes near enough to another point (maybe using some probability functions).
| [reply] |
|
At least in 2D, it can generate nice results. Red and green dots are the termites, blue dots are the wood (the clustered points).
The program that generates it is available from GitHub here.
Update: it is now also available from CPAN as AI::Termites.
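The termite idea can be sketched in 3D roughly as follows. This is an illustrative Python toy, not the algorithm used by AI::Termites: the function name, the unit-cube space, the fixed pick-up/drop radius, the uniform random step, and the rule that a termite jumps to a random spot after each pick-up or drop are all assumptions made to keep it short.

```python
import math
import random

def termite_clusters(n_wood=300, n_termites=20, steps=2000,
                     near=0.05, step_size=0.03, seed=None):
    """Return wood positions after termites have shuffled them around."""
    rng = random.Random(seed)

    def rand_pos():
        return [rng.random() for _ in range(3)]

    wood = [rand_pos() for _ in range(n_wood)]
    termites = [{"pos": rand_pos(), "carrying": False}
                for _ in range(n_termites)]

    for _ in range(steps):
        for t in termites:
            # Random walk, clamped to the unit cube.
            t["pos"] = [min(1.0, max(0.0, c + rng.uniform(-step_size, step_size)))
                        for c in t["pos"]]
            # Brute-force search for any wood piece within reach.
            hit = next((i for i, w in enumerate(wood)
                        if math.dist(w, t["pos"]) < near), None)
            if hit is None:
                continue
            if t["carrying"]:
                wood.append(list(t["pos"]))  # drop next to the piece found
                t["carrying"] = False
            else:
                wood.pop(hit)                # pick the piece up
                t["carrying"] = True
            # Jump away so the termite does not undo its own move at once.
            t["pos"] = rand_pos()

    # Termites still loaded at the end drop their piece where they stand.
    for t in termites:
        if t["carrying"]:
            wood.append(list(t["pos"]))
    return [tuple(w) for w in wood]

if __name__ == "__main__":
    for x, y, z in termite_clusters(seed=1)[:5]:
        print(f"{x:.4f} {y:.4f} {z:.4f}")
```

How clumped the result looks depends heavily on near, the number of steps, and the density of points, so those values would need tuning; replacing the fixed radius with pick-up and drop probabilities is the obvious next variation.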
| [reply] |