There's a contest to try to improve the Elo rating system for chess that's been running for the past few months. I don't know if it's been mentioned on the boards yet, but it might be of interest to people concerned about Go ratings.
Announcement: http://kaggle.com/chess
Update: http://kaggle.com/blog/2010/09/21/elo-v ... fway-mark/
Chess Ratings Contest--Improving on ELO
- shapenaji
- Lives in sente
- Posts: 1103
- Joined: Tue Apr 20, 2010 10:58 pm
- Rank: EGF 4d
- GD Posts: 952
- Location: Netherlands
- Has thanked: 407 times
- Been thanked: 422 times
Re: Chess Ratings Contest--Improving on ELO
I've been playing around with a distribution-based approach to rating.
Start players out with a normal distribution about their rating; then, when they play an opponent, use the probability that player A's rating is greater than player B's as the win probability. Based on the result, each player's distribution is then adjusted (in a Bayesian fashion) against the distribution of the player they played.
Over time, players can develop skewed distributions, which can be better descriptors of a person's strength than a single value and a variance.
Still playing with the code (my only hang-up right now is making the update process conserve rating points), but I think players might really appreciate the additional information about their play style that a distribution-based approach can give.
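A minimal sketch of this idea in Python, treating each player's strength as a discrete distribution over a grid of possible ratings (the grid, the starting shapes, and the function names here are illustrative assumptions, not the actual code):

```python
# Discrete Bayesian rating update: each player's strength is a
# probability distribution over a shared grid of candidate ratings.

def win_prob(ratings_a, probs_a, ratings_b, probs_b):
    """P(A beats B), modeled as P(rating_A > rating_B) over the joint grid."""
    p = 0.0
    for ra, pa in zip(ratings_a, probs_a):
        for rb, pb in zip(ratings_b, probs_b):
            if ra > rb:
                p += pa * pb
    return p

def update_after_win(ratings_a, probs_a, ratings_b, probs_b):
    """Posterior for A after A wins: weight each candidate rating by its
    chance of producing the observed result, then renormalize."""
    # Likelihood of "A wins" given A's rating ra: P(ra > rating_B)
    likelihood = [sum(pb for rb, pb in zip(ratings_b, probs_b) if ra > rb)
                  for ra in ratings_a]
    post = [pa * lk for pa, lk in zip(probs_a, likelihood)]
    z = sum(post)
    return [x / z for x in post]

# Two players on a shared grid; A's belief is peaked, B's is uniform.
grid = [1400 + 10 * i for i in range(21)]            # 1400 .. 1600
uniform = [1.0 / len(grid)] * len(grid)
peaked = [2.0 if 1480 <= r <= 1520 else 1.0 for r in grid]
peaked = [x / sum(peaked) for x in peaked]

p = win_prob(grid, peaked, grid, uniform)
post_a = update_after_win(grid, peaked, grid, uniform)
mean_before = sum(r * q for r, q in zip(grid, peaked))
mean_after = sum(r * q for r, q in zip(grid, post_a))
# After a win, A's posterior mass (and mean) shifts toward higher ratings.
```

Note that the posterior is free to take any shape the data supports, which is how the skewed distributions mentioned above emerge.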
Tactics yes, Tact no...
- Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Chess Ratings Contest--Improving on ELO
shapenaji wrote:I've been playing around with a distribution-based approach to rating,
Start players out with a normal distribution about their rating,
If the ratings are linear with rank, I do not think that the distribution is normal. You will have more players in the bottom half of the rank range. (That will not be true at low ranks, because of people who drop out.)
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
- shapenaji
- Lives in sente
- Posts: 1103
- Joined: Tue Apr 20, 2010 10:58 pm
- Rank: EGF 4d
- GD Posts: 952
- Location: Netherlands
- Has thanked: 407 times
- Been thanked: 422 times
Re: Chess Ratings Contest--Improving on ELO
It's probably not normal, but in line with the Bayesian approach, the distributions should gravitate toward a better-fitting shape.
Later on, with more data, we could start players with the typical distribution for players at that rank, but for now, with no other information, I feel a normal is the least biased prior.
Tactics yes, Tact no...
- palapiku
- Lives in sente
- Posts: 761
- Joined: Sun Apr 25, 2010 11:25 pm
- Rank: the k-word
- GD Posts: 0
- Has thanked: 152 times
- Been thanked: 204 times
Re: Chess Ratings Contest--Improving on ELO
"Subsequent statistical tests have shown that chess performance is almost certainly not normally distributed. Weaker players have significantly greater winning chances than Elo's model predicts. Therefore, both the USCF and FIDE have switched to formulas based on the logistic distribution. However, in deference to Elo's contribution, both organizations are still commonly said to use "the Elo system"."
Presumably, the logistic distribution is the best prior to use.
Also, I don't think it's easy to modify the posterior distributions arbitrarily (changing their type) without grossly overfitting.
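For context, the logistic model the quote refers to maps a rating difference straight to an expected score. A minimal sketch (the function name is made up; the 400-point scale is the standard Elo convention):

```python
def elo_expected(r_a, r_b):
    """Expected score for player A under the logistic Elo model
    (400-point scale used by USCF and FIDE)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# Equal ratings give an expected score of 0.5; a 400-point
# advantage gives 10/11 (about 0.909). The logistic has heavier
# tails than the normal, which is why weaker players do better
# against it than Elo's original normal model predicted.
```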
- shapenaji
- Lives in sente
- Posts: 1103
- Joined: Tue Apr 20, 2010 10:58 pm
- Rank: EGF 4d
- GD Posts: 952
- Location: Netherlands
- Has thanked: 407 times
- Been thanked: 422 times
Re: Chess Ratings Contest--Improving on ELO
I'll try starting folks out with a logistic, thanks for the tip.
As far as the gross overfitting... the way I do it now is as follows:
(EDIT: Fixed the description of A_dist, which was shockingly wrong.)
A_mean = player A's rating
B_mean = player B's rating
A_offsets = a vector of offsets about A's mean (for example, in increments of 0.1: (-1, -0.9, -0.8, ..., 0, 0.1, 0.2, ..., 1))
B_offsets = (ditto)
A_prob = the probability weight A's distribution assigns to each offset (summing to 1); B_prob likewise
diff_dist_mat = (A_mean + A_offset_mat) - (B_mean + B_offset_mat) ## A_offset_mat repeats A_offsets along every row, B_offset_mat repeats B_offsets down every column; the means are added to every element
prob_mat = A_prob_mat * B_prob_mat ## same row/column layout, but filled with the probability weights, so each entry is the joint probability of that pair of ratings
Alright, now you have a matrix of differences and the matrix of their associated probabilities. Then, the probability of player A winning is the sum of all the probabilities whose corresponding difference is greater than 0.
Based on this probability, we assign a number of points inversely proportional to the likelihood of the result. (The exact form may end up inverted or square-rooted... haven't decided this yet.)
We take that number of points and distribute it to the values of the matrix which were "correct", divvied up by how probable that result was. Finally, we use that matrix to extract new distributions for A and B.
EDIT: Then with these new distributions, we find the new means.
Am I making a glaring error here? I might be....
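A runnable sketch of the matrix recipe above, in Python (the grid width, the 0.1 increment, and the uniform starting weights are illustrative assumptions; the point-assignment and redistribution steps are omitted):

```python
def win_prob_grid(a_mean, b_mean, offsets, a_prob, b_prob):
    """Walk the difference matrix and the matching joint-probability
    matrix, summing the joint probability over every entry where
    A's rating exceeds B's."""
    p_win = 0.0
    for ob, pb in zip(offsets, b_prob):          # rows: player B
        for oa, pa in zip(offsets, a_prob):      # columns: player A
            diff = (a_mean + oa) - (b_mean + ob)  # diff_dist_mat entry
            if diff > 0:
                p_win += pa * pb                  # prob_mat entry
    return p_win

# A shared grid of offsets in increments of 0.1, from -1 to 1.
offsets = [round(-1.0 + 0.1 * k, 1) for k in range(21)]
n = len(offsets)
a_prob = [1.0 / n] * n   # uniform starting weights (illustrative)
b_prob = [1.0 / n] * n

# A's mean slightly above B's, so A should be favored.
p = win_prob_grid(0.15, 0.0, offsets, a_prob, b_prob)
```

One detail the strict `diff > 0` test exposes: ties on the diagonal count as neither win nor loss, so two players with identical distributions come out slightly under 1/2; a real implementation would have to decide how to split that probability mass.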
Tactics yes, Tact no...