Sports Conference Rankings, Colley Matrix Style

by Zaxo (Archbishop)
on Oct 25, 2006 at 04:49 UTC ( #580484=CUFP )

It's (US) football season now, and arguments over Strength of Schedule and the iniquity of zebras are heard across the land.

This is Wesley Colley's elegant method of calculating a probability-like rating from the results of contests. I won't go into the mathematical details or properties of the method; a paper at Colley's site covers them.

Colley uses this method to rank all Division I-A teams as part of the "computer" segment of the all-important BCS ratings. It is distinguished by its simplicity and lack of mystery tweaks.

The magnificent PDL module is ideal for carrying out these calculations. Here, I've applied them to intra-conference games only, to get a current conference ranking. The data is hard-coded in this simple version, with enough information in the comments to let you replace it with your own favorite conference's results. It would be more flexible and convenient to get the data from a database query or by web scraping.
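One way to avoid typing the matrix by hand is to derive it from a list of results. The following is only a minimal sketch, assuming a hypothetical input format of one [winner, loser] pair per game; the ten pairs are inferred from the hard-coded matrix and win/loss vector in the script below, not taken from an actual schedule. It follows the conventions documented in that script's comments (diagonal is games played plus two, minus one per meeting off the diagonal, and 1 + (wins - losses)/2 in the vector). To stay self-contained it uses plain Perl arrays and a small Gauss-Jordan routine in place of PDL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @teams = (
    'Pittsburgh', 'Louisville', 'Rutgers', 'West Virginia',
    'South Florida', 'Connecticut', 'Syracuse', 'Cincinnati'
);

# Hypothetical input: [winner, loser] per game played. These pairs are
# inferred from the post's hard-coded matrix and win/loss vector.
my @games = (
    [ 'Rutgers',       'Pittsburgh'    ],
    [ 'Pittsburgh',    'Syracuse'      ],
    [ 'Pittsburgh',    'Cincinnati'    ],
    [ 'Louisville',    'Syracuse'      ],
    [ 'Louisville',    'Cincinnati'    ],
    [ 'Rutgers',       'South Florida' ],
    [ 'West Virginia', 'Connecticut'   ],
    [ 'West Virginia', 'Syracuse'      ],
    [ 'South Florida', 'Connecticut'   ],
    [ 'Cincinnati',    'South Florida' ],
);

my %idx;
@idx{@teams} = 0 .. $#teams;

# Empty-schedule state: 2 on the diagonal, 1 in the win/loss vector.
my @C  = map { my $i = $_; [ map { $_ == $i ? 2 : 0 } 0 .. $#teams ] } 0 .. $#teams;
my @wl = (1) x @teams;

for my $g (@games) {
    my ($w, $l) = @idx{@$g};
    $C[$w][$w]++;  $C[$l][$l]++;    # one more game played by each side
    $C[$w][$l]--;  $C[$l][$w]--;    # minus one per meeting, kept symmetric
    $wl[$w] += 0.5;                 # accumulates 1 + (wins - losses)/2
    $wl[$l] -= 0.5;
}

# Gauss-Jordan elimination without pivoting. Each diagonal entry exceeds
# the sum of the absolute off-diagonal entries in its row by 2, so the
# pivots cannot vanish.
sub solve {
    my ($mat, $rhs) = @_;
    my $n = @$rhs;
    my @M = map { [ @{ $mat->[$_] }, $rhs->[$_] ] } 0 .. $n - 1;   # augmented copy
    for my $k (0 .. $n - 1) {
        my $pivot = $M[$k][$k];
        $_ /= $pivot for @{ $M[$k] };
        for my $i (grep { $_ != $k } 0 .. $n - 1) {
            my $f = $M[$i][$k];
            next unless $f;
            $M[$i][$_] -= $f * $M[$k][$_] for 0 .. $n;
        }
    }
    return map { $_->[$n] } @M;
}

my %rating;
@rating{@teams} = solve(\@C, \@wl);

my $rank = 1;
for my $team (sort { $rating{$b} <=> $rating{$a} } keys %rating) {
    printf "%-4d%-20s%.4f\n", $rank++, $team, $rating{$team};
}
```

With PDL available, @C and @wl could instead be handed to pdl() and solved as in the script below; the resulting ratings should agree with its listing to rounding.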

#!/usr/bin/perl
# -*-EPerl-*-
use PDL;

my @becteams = (
    'Pittsburgh', 'Louisville', 'Rutgers', 'West Virginia',
    'South Florida', 'Connecticut', 'Syracuse', 'Cincinnati'
);

# $C is the Colley Matrix. It depends only on the schedule
# of games already played. Rows and columns are indexed in
# the same order, by teams. The diagonal elements are the
# number of games played plus two. Off-diagonals are zero
# for no game yet played for the indexed teams, or minus one
# for a game played. It contains nothing about the result of
# the games. It's obviously a symmetric matrix.
my $C = pdl([
#    UP UL RU WV SF CT SU UC
    [ 5, 0,-1, 0, 0, 0,-1,-1],  # Pittsburgh
    [ 0, 4, 0, 0, 0, 0,-1,-1],  # Louisville
    [-1, 0, 4, 0,-1, 0, 0, 0],  # Rutgers
    [ 0, 0, 0, 4, 0,-1,-1, 0],  # West Virginia
    [ 0, 0,-1, 0, 5,-1, 0,-1],  # South Florida
    [ 0, 0, 0,-1,-1, 4, 0, 0],  # Connecticut
    [-1,-1, 0,-1, 0, 0, 5, 0],  # Syracuse
    [-1,-1, 0, 0,-1, 0, 0, 5]   # Cincinnati
]);

# $wl is a column vector containing win and loss information.
# For each team in the same order as $C is indexed, the value
# is numerically 1 + (wins - losses)/2.
my $wl = pdl([[ 3/2],[ 2 ],[ 2 ],[ 2 ],[ 1/2],[ 0 ],[-1/2],[ 1/2]]);
#              Pitt   UL    Rut   WVU   USF   UConn  SU     Cincy

my $c = $C->inv;
my $r = $c x $wl;

my %rating;
@rating{@becteams} = list $r;

{
    my $ct = 1;
    for (sort {$rating{$b}<=>$rating{$a}} keys %rating) {
        my $out = pack 'A4 A20 A6', $ct++, $_,
            sprintf '%5.4f', $rating{$_};
        print $out, $/;
    }
}
__END__
1   Rutgers             0.7443
2   Louisville          0.6779
3   West Virginia       0.6339
4   Pittsburgh          0.5912
5   Cincinnati          0.4310
6   South Florida       0.3861
7   Syracuse            0.2806
8   Connecticut         0.2550

Congratulations to Rutgers! Their higher rating, for the same record as Louisville and West Virginia, comes from having beaten tougher teams so far. In the Big East everybody plays everybody, so that advantage will level off by the end of the season. A conference that is too large to allow all-pairs play allows more interesting uses of this method.
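To see where the gap comes from, each row of the system solved in the script can be rearranged for a single team. The lines below are just that arithmetic for the three unbeaten teams, using the matrix rows and the printed ratings; the symbols r with a team abbreviation are shorthand introduced here, not notation from Colley's paper.

```latex
\begin{aligned}
4\,r_{\text{RU}} - r_{\text{UP}} - r_{\text{SF}} &= 2
  &\Rightarrow\quad r_{\text{RU}} &= \tfrac{2 + 0.5912 + 0.3861}{4} \approx 0.7443 \\
4\,r_{\text{UL}} - r_{\text{SU}} - r_{\text{UC}} &= 2
  &\Rightarrow\quad r_{\text{UL}} &= \tfrac{2 + 0.2806 + 0.4310}{4} \approx 0.6779 \\
4\,r_{\text{WV}} - r_{\text{CT}} - r_{\text{SU}} &= 2
  &\Rightarrow\quad r_{\text{WV}} &= \tfrac{2 + 0.2550 + 0.2806}{4} \approx 0.6339
\end{aligned}
```

All three teams are 2-0, so the right-hand sides are identical; the only thing that differs is the sum of their opponents' ratings.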

After Compline,

Replies are listed 'Best First'.
Re: Sports Conference Rankings, Colley Matrix Style
by zentara (Archbishop) on Oct 25, 2006 at 12:49 UTC
    I remember reading an article in Sports Illustrated about a statistician who applied this concept to horse racing. He meticulously compiled statistics on every horse and every race... things like air temp, time of day, dry, muddy, etc. He could then predict the results of horse races with better than 50% accuracy. He was in Las Vegas, living off his bets. :-)

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
