Efficient ways of storing a data set for random access

dthacker has asked for the wisdom of the Perl Monks concerning the following question:

I'm not lazy enough. Consider the following case. I'm working on a soccer simulation that assigns players a score for each of 4 attributes. For example, midfielders will have a passing attribute between 6 and 21, a shooting attribute between 6 and 12, and so on. I want to tweak the distribution to a "bell-shaped curve" so that most of the players are assigned a score near the midpoint of the range, and fewer are assigned a high or low score. My solution was to create a file with 100 values like this:

lines with score of 6: 3
lines with score of 7: 5
lines with score of 8: 6
...
lines with score of 12: 13
...
lines with score of 21: 2

I read the file into an array, generate a random number, and assign the value in $array[$rand] to a player.

There must be an easier and more perlish way to do this, but I'm having trouble visualizing a data structure that would do it. Any suggestions?

Dave

Code On!



Replies are listed 'Best First'.
Re: Efficient ways of storing a data set for random access by Zaxo (Archbishop) on Jan 29, 2004 at 03:45 UTC
You can get random numbers that follow a given distribution from Math::Random. The `&Math::Random::random_normal` function is exported by default. It takes up to three arguments: the number of samples to generate, their mean, and their standard deviation. In scalar context it generates a single sample.

After Compline, Zaxo
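A minimal sketch of how `random_normal` might be applied to the 6-21 passing range, assuming Math::Random is installed from CPAN. The helper names `clamp_score` and `normal_score`, the mean at the midpoint of the range, and the standard deviation of one sixth of the range are illustrative assumptions, not part of the reply above. A normal sample is real-valued and unbounded, so the sketch rounds it and clamps it to the range.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Round a real-valued sample to the nearest integer and clamp it to
# the inclusive range [$low, $high].
sub clamp_score {
    my ($x, $low, $high) = @_;
    my $score = floor($x + 0.5);
    $score = $low  if $score < $low;
    $score = $high if $score > $high;
    return $score;
}

# Draw one bell-shaped score for the given range.
sub normal_score {
    my ($low, $high) = @_;
    # CPAN module, not core; loaded at run time so clamp_score can be
    # used on its own without it.
    require Math::Random;
    my $mean = ($low + $high) / 2;
    my $sd   = ($high - $low) / 6;   # assumption: range spans about 6 sd
    # Assigning to a scalar gives scalar context, so one sample comes back.
    my $x = Math::Random::random_normal(1, $mean, $sd);
    return clamp_score($x, $low, $high);
}
```

Calling `normal_score(6, 21)` would then yield one passing score per call. Clamping folds the rare out-of-range samples onto 6 and 21; redrawing until the sample lands in range would be another option.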
Re: Efficient ways of storing a data set for random access by l3nz (Friar) on Jan 29, 2004 at 12:35 UTC
Your approach is not bad at all; you are trading memory for access that is much faster than generating each value on demand. The use of a data file allows for simple tweaking of the data distribution just by altering the number of elements in the file. I'd extend it by randomizing based on the actual number of items read from the file, and I'd load the file once at startup without touching it again; this way your program will be fast. Or of course you could use one of the statistical modules...
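The load-once, size-by-what-was-read idea above can be sketched roughly as follows. The sub names `load_scores` and `pick_score` are hypothetical, and the sketch assumes the data file holds one score per line.

```perl
use strict;
use warnings;

# Read one score per line from an open filehandle; blank lines are skipped.
# Returns a reference to the array of scores.
sub load_scores {
    my ($fh) = @_;
    my @scores;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;
        push @scores, $line;
    }
    return \@scores;
}

# Pick a uniformly random entry. rand @$scores scales to however many
# lines were actually read, so the file need not hold exactly 100 values.
# The fractional index is truncated to an integer by the array lookup.
sub pick_score {
    my ($scores) = @_;
    return $scores->[ rand @$scores ];
}
```

At startup the program would open the data file, call `load_scores` once on the handle, and keep the returned reference; each player then gets `pick_score($scores)`. Adding or removing lines in the file changes the distribution without any code change.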
Re: Efficient ways of storing a data set for random access by Roy Johnson (Monsignor) on Jan 29, 2004 at 16:19 UTC
A sum of random numbers will be more bell-shaped than a single random number. Hence, rolling two 6-sided dice will yield more sevens on average than rolling an 11-sided die numbered from 2 to 12. The more dice, the more heavily weighted toward the center of the range you'll be. For your 6-21 example, there are 16 numbers that need to be covered. You could roll two 16-sided dice, add them together, and divide by two to get a center-weighted result in the desired range. Here's a little program to demonstrate the distribution:

```perl
my $range   = 16;  # number of distinct scores (6 through 21)
my $low_end = 6;   # lowest score in the range
my %freq    = ();

# Average two uniform rolls, truncate, and shift up to start at $low_end
for (1..1000) {
    my $result = int((rand($range)+rand($range))/2)+$low_end;
    ++$freq{$result};
}

# Print how often each score came up, in numeric order
print "$_: $freq{$_}\n" for (sort {$a<=>$b} keys %freq);
__END__
6: 10
7: 35
8: 44
9: 52
10: 61
11: 77
12: 107
13: 109
14: 118
15: 111
16: 95
17: 46
18: 49
19: 57
20: 24
21: 5
```

The PerlMonk `tr///` Advocate
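The "more dice" remark above could be generalized along these lines. This is only a sketch: the sub name `center_weighted` and its `$dice` parameter are hypothetical, and with `$dice` set to 2 it reduces to the same expression as the two-roll program.

```perl
use strict;
use warnings;

# Average $dice uniform rolls over the range and shift to start at $low.
# More dice concentrate the results nearer the middle of [$low, $high].
sub center_weighted {
    my ($low, $high, $dice) = @_;
    my $range = $high - $low + 1;      # count of distinct scores
    my $sum = 0;
    $sum += rand($range) for 1 .. $dice;
    return int($sum / $dice) + $low;   # truncated average, then shifted
}
```

For example, `center_weighted(6, 21, 2)` matches the two-dice case, while `center_weighted(6, 21, 4)` would give a narrower peak. With one die the result is simply uniform over the range.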

