PerlMonks  

Picking (potentially) winning lottery numbers

by scain (Curate)
on Jul 24, 2001 at 17:46 UTC

Hello all,

This is my first attempt to place real code on the site. I wanted to learn how to use HTML::TableExtract and Date::Calc, so I thought this would be a good way of trying it out. This script hits the Ohio Lottery search site and gets all of the past Super Lotto results and determines which numbers have occurred most commonly and least commonly. I am not so silly as to think this would actually work for finding winning lottery numbers (or I would not publish the code and would by now be a millionaire), but it is an interesting diversion nonetheless.

I hope you enjoy it,
Scott


#!/usr/bin/perl -w
use strict;

#
# OHlottery.pl  By Scott Cain
#
# Uses CPAN modules to contact the Ohio Lottery web site search page
# and extracts the winning numbers for all super lotto drawings for which
# there is data, which starts at the beginning of 1998.  The frequency of
# each number's occurrence is calculated and compared to the expected
# value, i.e., 1/47 or 1/49 depending on when the drawing was held.
#
# Things that could be done to make this better, but that I probably
# won't do:
#   - have the results saved to a flat file so the script doesn't have
#     to hit the web server for every past result.
#   - make it more general to allow analysis of other games.
#

use LWP::Simple qw(get);
use HTML::TableExtract;
use Date::Calc qw(Day_of_Week Add_Delta_Days Today Date_to_Days
                  Decode_Day_of_Week Date_to_Text);

my @start_date   = qw/1998 1 3/;    # a Saturday, the first for which data is available.
my @current_date = @start_date;
my $current_dow  = Day_of_Week(@current_date);
my $delta_days;
my %lotto;
my %lottoplus;
my $lotto_count     = 0;
my $lottoplus_count = 0;

#
# properly initialize hashes
#
for my $i ( 1 .. 49 ) {
    $lotto{$i}     = 0;
    $lottoplus{$i} = 0;
}

while ( Date_to_Days(@current_date) < Date_to_Days( Today() ) ) {
    my $URL =
        "http://www.ohiolottery.com/numbers/searchresults_bydate.asp?FromMonth="
      . $current_date[1]
      . "&FromDay="
      . $current_date[2]
      . "&FromYear="
      . $current_date[0];

    my $page = get($URL);

    if ($page) {
        my $te = HTML::TableExtract->new( depth => 1 );
        $te->parse($page);
        foreach my $ts ( $te->table_states ) {
            foreach my $row ( $ts->rows ) {

                # skip empty cells so -w does not warn about undef values
                my @game_str = grep { defined($_) && /lotto/i } @$row;
                next unless $game_str[0] and defined $$row[3];

                if ( $game_str[0] =~ /plus/i and    # differentiates old from new lotto
                    $$row[3] =~ /(\d+)-(\d+)-(\d+)-(\d+)-(\d+)-(\d+)/ )
                {
                    $lottoplus{$1}++;
                    $lottoplus{$2}++;
                    $lottoplus{$3}++;
                    $lottoplus{$4}++;
                    $lottoplus{$5}++;
                    $lottoplus{$6}++;
                    $lottoplus_count++;
                }
                elsif ( $$row[3] =~ /(\d+)-(\d+)-(\d+)-(\d+)-(\d+)-(\d+)/ ) {
                    $lotto{$1}++;
                    $lotto{$2}++;
                    $lotto{$3}++;
                    $lotto{$4}++;
                    $lotto{$5}++;
                    $lotto{$6}++;
                    $lotto_count++;
                }
            }
        }
    }
    else {    # the LWP get didn't work
        # call Date_to_Text outside the string; it is not interpolated inside quotes
        print Date_to_Text(@current_date), " failed\n";
    }

    #
    # figure out the next day to use.
    # (drawings are only held on Saturdays and Wednesdays)
    #
    $current_dow = Day_of_Week(@current_date);
    if ( $current_dow == Decode_Day_of_Week("Saturday") ) {
        $delta_days = 4;
    }
    else {
        $delta_days = 3;
    }
    @current_date = Add_Delta_Days( @current_date, $delta_days );
}    # closes while(date) loop

#
# now do some simple statistics.
#
# bail out early rather than divide by zero if no results were retrieved
die "No drawings were found; nothing to analyze.\n"
  unless $lottoplus_count and $lotto_count;

my $num_lottoplus_balls = 6 * $lottoplus_count;
my $num_lotto_balls     = 6 * $lotto_count;
for my $i ( 1 .. 49 ) {
    $lottoplus{$i} = $lottoplus{$i} / $num_lottoplus_balls;
    $lotto{$i}     = $lotto{$i} / $num_lotto_balls;
}

my @toptobottom = sort { $lottoplus{$b} <=> $lottoplus{$a} } keys %lottoplus;

#
# print fairly pretty results--
# probably could have used a sub here, but cut & paste is so convenient.
#
my $expect = 1.0 / 49.0;
print 'Percentages are % deviation from expected value';
print "\nTop 8 superlotto plus balls\n";
print "for $lottoplus_count drawings\n";
for my $i ( 0 .. 7 ) {
    my $deviation = 100 * ( $lottoplus{ $toptobottom[$i] } - $expect ) / $expect;
    printf( "%.2d -> %+.1f%%\n", $toptobottom[$i], $deviation );
}
print "\nBottom 8 superlotto plus balls\n";
for my $i ( 41 .. 48 ) {
    my $deviation = 100 * ( $lottoplus{ $toptobottom[$i] } - $expect ) / $expect;
    printf( "%.2d -> %+.1f%%\n", $toptobottom[$i], $deviation );
}

@toptobottom = sort { $lotto{$b} <=> $lotto{$a} } keys %lotto;
$expect = 1.0 / 47.0;
print "\n\nOld Super Lotto results included for historical comparison.\n";
print "\nTop 8 superlotto balls\n";
print "for $lotto_count drawings\n";
for my $i ( 0 .. 7 ) {
    my $deviation = 100 * ( $lotto{ $toptobottom[$i] } - $expect ) / $expect;
    printf( "%.2d -> %+.1f%%\n", $toptobottom[$i], $deviation );
}
print "\nBottom 8 superlotto balls\n";

# balls 48 and 49 did not exist in the old game, so they sort last with a
# count of zero; indices 39 .. 46 skip them
for my $i ( 39 .. 46 ) {
    my $deviation = 100 * ( $lotto{ $toptobottom[$i] } - $expect ) / $expect;
    printf( "%.2d -> %+.1f%%\n", $toptobottom[$i], $deviation );
}
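
The script's own comment admits the four report loops could have been a sub. A minimal sketch of that refactoring follows; `report_extremes` and its parameters are hypothetical names, not part of the original script, and the sample frequencies are made up purely to exercise it.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns report lines for
# the $n most and $n least frequent balls among balls 1 .. $ball_count.
# %$freq maps ball number => observed fraction of all balls drawn.
sub report_extremes {
    my ( $label, $freq, $ball_count, $n ) = @_;
    my $expect = 1 / $ball_count;

    # Most frequent first; ties are broken by ball number for stable output.
    my @ranked =
      sort { $freq->{$b} <=> $freq->{$a} || $a <=> $b } 1 .. $ball_count;

    # Same "%.2d -> %+.1f%%" layout the original script prints.
    my $fmt = sub {
        my ($ball) = @_;
        my $deviation = 100 * ( $freq->{$ball} - $expect ) / $expect;
        return sprintf( "%.2d -> %+.1f%%", $ball, $deviation );
    };

    return (
        "Top $n $label balls",
        ( map { $fmt->($_) } @ranked[ 0 .. $n - 1 ] ),
        "Bottom $n $label balls",
        ( map { $fmt->($_) } @ranked[ -$n .. -1 ] ),
    );
}

# Made-up frequencies for a 4-ball toy game, not real lottery data.
my %sample = ( 1 => 0.40, 2 => 0.25, 3 => 0.25, 4 => 0.10 );
my @lines = report_extremes( 'sample', \%sample, 4, 1 );
print "$_\n" for @lines;
```

In the script this might be called as `report_extremes( 'superlotto plus', \%lottoplus, 49, 8 )` and `report_extremes( 'superlotto', \%lotto, 47, 8 )`. Ranking only balls 1 .. 47 for the old game would make the `39 .. 46` index trick unnecessary.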

Re: Picking (potentially) winning lottery numbers
by grinder (Bishop) on Jul 24, 2001 at 18:38 UTC
    Nice hack! This begs the question, though, what is the optimal choice in light of what other people will do? If it turns out that some numbers are more likely to occur than others, then rational people will choose those numbers. So if more people do choose those numbers, the pot will be split amongst a greater number of players, thus you will win less. So maybe you're better off choosing numbers that come up less frequently. Ah, but hang on, another rational choice is to choose the numbers that come up less frequently, because their turn has gotta come soon. So more people will choose those numbers, leading to smaller payoffs to you when you win.

    The fallacy at the root of these kinds of arguments is that the balls have a notion of what went before. If ball 32 hasn't come up in the 276 previous games, that doesn't mean it's more or less likely to come up than any other ball in the next game. They have no history of what went before.

    Which is to say that if you want to play Lotto (a.k.a. Fool's Tax) and write a script to pick Lotto numbers for you, start with a truly random source. I recall reading some time ago that the "autopick" selection of some country, where you pay and the computer fills the coupons with random numbers for you, was flawed. The numbers it chose were not really random. The numbers failed the Kruskal-Wallis test, Mann-Whitney test, or some damn thing. As the numbers were not truly random, and the actual number selection was, well, those autopick coupons just didn't win as often. Chance? Design?

    update: cleaned up the prose in a couple of spots -- ahhh, proofreading.

    Random numbers should not be generated with a method chosen at random. Some theory should be used. -- Donald Knuth.

    --
    g r i n d e r
      g r i n d e r ,

      Thanks. All of what you say is largely true, although one thing that was picking at the back of my brain was: what if the balls do have some history? For instance, what if the paint used to put numbers on the balls resulted in some balls being (slightly) heavier than others? Presumably then, with enough tests, it would become apparent which balls were more likely to come out. (I don't know how they are actually picked from the machine, but presumably gravity is involved somehow.) Anyway, I don't really buy that argument, and I think there would have to be too many trials to determine if it were true.

      But it is amusing that you could make equally rational-sounding arguments for picking the most common and the least common balls; that's why I display both :-)

      Scott
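
Whether "enough tests" would expose a heavier ball can be framed as a goodness-of-fit check. The sketch below computes a Pearson chi-squared statistic for ball counts against a uniform expectation. `chi_squared_uniform` is a hypothetical helper, the counts are made up, and treating every drawn ball as an independent trial is an assumption that is only approximately true, since the six balls in one drawing are all different.

```perl
use strict;
use warnings;

# Pearson chi-squared statistic for ball counts against a uniform expectation.
# Assumption: each drawn ball is treated as an independent trial, which is
# only approximate because the six balls in one drawing are all different.
sub chi_squared_uniform {
    my (@counts) = @_;
    my $total = 0;
    $total += $_ for @counts;
    die "no observations\n" unless $total > 0;

    my $expected = $total / scalar(@counts);    # same expectation for every ball
    my $chi2     = 0;
    $chi2 += ( $_ - $expected )**2 / $expected for @counts;
    return $chi2;
}

# Made-up counts for a 4-ball toy game (not real lottery data).
my @even   = ( 25, 25, 25, 25 );
my @skewed = ( 40, 20, 20, 20 );
printf "even:   %.2f\n", chi_squared_uniform(@even);
printf "skewed: %.2f\n", chi_squared_uniform(@skewed);
```

With real data the counts would be the raw tallies in `%lottoplus` before they are divided by the number of balls. For fair balls the statistic tends to sit near the degrees of freedom (number of balls minus one), so only a value well above that would hint at a bias.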

        You can make a better argument for not putting money into the lottery.

        But that said, if you flip a US penny, there is a slight bias towards coming up heads. Not enough of one in practice to be useful though. (It is easiest to see if you balance the pennies on end and then whack the table.)

        In the UK, at least, there are multiple sets of balls and multiple machines to pick them, and the balls are replaced frequently. This all stops trends from developing and keeps the results as random as ever.

        I'd be interested in it being able to calculate my chances of winning anything, and doing some manipulation with that (like what if I choose two sets of numbers for a week.)

        --
        RatArsed
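
The chances asked about here can be worked out directly. The sketch below assumes a plain 6-of-49 game with no bonus ball (the ball count matches what the script uses for the newer game); `choose` and `p_match` are hypothetical helper names introduced for illustration.

```perl
use strict;
use warnings;

# Binomial coefficient C(n, k), computed incrementally so every
# intermediate value stays an exact integer.
sub choose {
    my ( $n, $k ) = @_;
    return 0 if $k < 0 || $k > $n;
    my $result = 1;
    $result = $result * ( $n - $k + $_ ) / $_ for 1 .. $k;
    return $result;
}

# Probability that a ticket with $pick numbers matches exactly $m of the
# $pick balls drawn from $balls (hypergeometric distribution).
sub p_match {
    my ( $balls, $pick, $m ) = @_;
    return choose( $pick, $m ) * choose( $balls - $pick, $pick - $m )
      / choose( $balls, $pick );
}

my $combos = choose( 49, 6 );
print "6-of-49 combinations: $combos\n";

# Two different tickets cover two of the equally likely combinations.
printf "Jackpot odds with two different tickets: 1 in %d\n", $combos / 2;
printf "P(exactly %d matches) = %.8f\n", $_, p_match( 49, 6, $_ ) for 0 .. 6;
```

Buying two different sets of numbers for one drawing simply doubles the jackpot probability, from 1 in 13,983,816 to 1 in 6,991,908 under these assumptions. Every individual combination remains equally likely.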

      As the numbers were not truly random, and the actual number selection was, well, those autopick coupons just didn't win as often. Chance? Design?

      Whilst I am as much of a conspiracy theorist as the next man, this statement flies in the face of what you just said. Any valid selection of numbers (random or not) has the *same* chance of winning. 1-2-3-4-5-6 is not random but has the same chance of winning as 14-36-21-1-7-8.

      BTW most lotteries drawn with balls are not random. The weight of the paint for the numbers and other things that lead to differences between balls skew the results. Unfortunately the skew is just not enough to be able to profit from.

      The famous story of breaking the bank at Monte Carlo was supposedly true and due to an imbalance of the roulette wheel. The advantage in roulette is that the house's margin is ~3%, so you don't need too much skew to screw that.

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
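
To put a rough number on "not too much skew", the sketch below works through a single bet. It assumes a single-zero wheel with 37 pockets and a straight-up bet paying 35 to 1; neither detail is stated in the thread, and `expected_value` is a hypothetical helper.

```perl
use strict;
use warnings;

# Assumption: a single-zero wheel (37 pockets) and a straight-up bet that
# pays 35 to 1. Neither detail is stated in the thread.
my $pockets = 37;
my $payout  = 35;

# Expected profit per unit staked when the chosen pocket hits with probability $p.
sub expected_value {
    my ($p) = @_;
    return $p * $payout - ( 1 - $p );
}

my $fair_p      = 1 / $pockets;
my $breakeven_p = 1 / ( $payout + 1 );

printf "House edge on a fair wheel: %.2f%%\n", -100 * expected_value($fair_p);
printf "Break-even hit probability: %.4f (fair is %.4f)\n", $breakeven_p, $fair_p;
printf "Relative skew needed:       %.2f%%\n", 100 * ( $breakeven_p / $fair_p - 1 );
```

Under these assumptions the edge is about 2.7%, in line with the ~3% quoted above, and a pocket that comes up roughly 2.8% more often than it should is enough to break even.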
