There's a contest to try to improve the Elo rating system for chess that's been running for the past few months. I don't know if it's been mentioned on the boards yet, but it might be of interest to people concerned about Go ratings.
Announcement: http://kaggle.com/chess
Update: http://kaggle.com/blog/2010/09/21/elo-v ... fway-mark/
Chess Ratings Contest--Improving on ELO
- shapenaji
- Lives in sente
- Posts: 1103
- Joined: Tue Apr 20, 2010 10:58 pm
- Rank: EGF 4d
- GD Posts: 952
- Location: Netherlands
- Has thanked: 407 times
- Been thanked: 422 times
Re: Chess Ratings Contest--Improving on ELO
I've been playing around with a distribution-based approach to rating.
Start players out with a normal distribution about their rating; then, when they play an opponent, use the probability that player A's rating is greater than player B's as the win probability. Based on the result, each player's distribution is then adjusted (in a Bayesian fashion) against the distribution of the player they played.
Over time, players can develop skewed distributions, which can be better descriptors of a person's strength than a single value and a variance.
Still playing with the code (my only hang-up right now is making the update process conserve rating points), but I think players might really appreciate the additional information about their play style that a distribution-based approach can give.
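A minimal sketch of this idea in Python, treating each player's strength as a discrete distribution over a grid of possible ratings (the grid, the starting shapes, and the function names here are illustrative assumptions, not the actual code):

```python
# Discrete Bayesian rating update: each player's strength is a
# probability distribution over a shared grid of candidate ratings.

def win_prob(ratings_a, probs_a, ratings_b, probs_b):
    """P(A beats B), modeled as P(rating_A > rating_B) over the joint grid."""
    p = 0.0
    for ra, pa in zip(ratings_a, probs_a):
        for rb, pb in zip(ratings_b, probs_b):
            if ra > rb:
                p += pa * pb
    return p

def update_after_win(ratings_a, probs_a, ratings_b, probs_b):
    """Posterior for A after A wins: weight each candidate rating by its
    chance of producing the observed result, then renormalize."""
    # Likelihood of "A wins" given A's rating ra: P(ra > rating_B)
    likelihood = [sum(pb for rb, pb in zip(ratings_b, probs_b) if ra > rb)
                  for ra in ratings_a]
    post = [pa * lk for pa, lk in zip(probs_a, likelihood)]
    z = sum(post)
    return [x / z for x in post]

# Two players on a shared grid; A's belief is peaked, B's is uniform.
grid = [1400 + 10 * i for i in range(21)]            # 1400 .. 1600
uniform = [1.0 / len(grid)] * len(grid)
peaked = [2.0 if 1480 <= r <= 1520 else 1.0 for r in grid]
peaked = [x / sum(peaked) for x in peaked]

p = win_prob(grid, peaked, grid, uniform)
post_a = update_after_win(grid, peaked, grid, uniform)
mean_before = sum(r * q for r, q in zip(grid, peaked))
mean_after = sum(r * q for r, q in zip(grid, post_a))
# After a win, A's posterior mass (and mean) shifts toward higher ratings.
```

Note that the posterior is free to take any shape the data supports, which is how the skewed distributions mentioned above emerge.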
Tactics yes, Tact no...
- Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Chess Ratings Contest--Improving on ELO
shapenaji wrote:I've been playing around with a distribution-based approach to rating,
Start players out with a normal distribution about their rating,
If the ratings are linear with rank, I do not think that the distribution is normal. You will have more players in the bottom half of the rank range. (That will not be true at low ranks, because of people who drop out.)
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
- shapenaji
- Lives in sente
- Posts: 1103
- Joined: Tue Apr 20, 2010 10:58 pm
- Rank: EGF 4d
- GD Posts: 952
- Location: Netherlands
- Has thanked: 407 times
- Been thanked: 422 times
Re: Chess Ratings Contest--Improving on ELO
It's probably not normal, but in line with the Bayesian approach, the distributions should gravitate toward a better-fitting shape.
Later on, with more data, we could start players with the typical distribution for players at that rank, but for now, with no other information, I feel a normal is the least biased prior.
Tactics yes, Tact no...
- palapiku
- Lives in sente
- Posts: 761
- Joined: Sun Apr 25, 2010 11:25 pm
- Rank: the k-word
- GD Posts: 0
- Has thanked: 152 times
- Been thanked: 204 times
Re: Chess Ratings Contest--Improving on ELO
"Subsequent statistical tests have shown that chess performance is almost certainly not normally distributed. Weaker players have significantly greater winning chances than Elo's model predicts. Therefore, both the USCF and FIDE have switched to formulas based on the logistic distribution. However, in deference to Elo's contribution, both organizations are still commonly said to use "the Elo system"."
Presumably, the logistic distribution is the best prior to use.
Also, I don't think it's easy to modify the posterior distributions arbitrarily (changing their type) without grossly overfitting.
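For context, the logistic model the quote refers to maps a rating difference straight to an expected score. A minimal sketch (the function name is made up; the 400-point scale is the standard Elo convention):

```python
def elo_expected(r_a, r_b):
    """Expected score for player A under the logistic Elo model
    (400-point scale used by USCF and FIDE)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# Equal ratings give an expected score of 0.5; a 400-point
# advantage gives 10/11 (about 0.909). The logistic has heavier
# tails than the normal, which is why weaker players do better
# against it than Elo's original normal model predicted.
```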
- shapenaji
- Lives in sente
- Posts: 1103
- Joined: Tue Apr 20, 2010 10:58 pm
- Rank: EGF 4d
- GD Posts: 952
- Location: Netherlands
- Has thanked: 407 times
- Been thanked: 422 times
Re: Chess Ratings Contest--Improving on ELO
I'll try starting folks out with a logistic, thanks for the tip.
As far as the gross overfitting... the way I do it now is as follows:
(EDIT: Fixed the description of A_dist, which was shockingly wrong.)
A_mean = player A's rating
B_mean = player B's rating
A_offsets = a vector of offsets about A's mean (for example, in increments of 0.1: (-1, -0.9, -0.8, ..., 0, 0.1, 0.2, ..., 1))
B_offsets = (ditto)
A_prob = the probability weight A's distribution assigns to each offset (summing to 1); B_prob likewise
diff_dist_mat = (A_mean + A_offset_mat) - (B_mean + B_offset_mat) ## A_offset_mat repeats A_offsets along every row, B_offset_mat repeats B_offsets down every column; the means are added to every element
prob_mat = A_prob_mat * B_prob_mat ## same row/column layout, but filled with the probability weights, so each entry is the joint probability of that pair of ratings
Alright, now you have a matrix of differences and the matrix of their associated probabilities. Then, the probability of player A winning is the sum of all the probabilities whose corresponding difference is greater than 0.
Based on this probability, we assign a number of points inversely proportional to the likelihood of the result. (The exact form may end up inverted or square-rooted... haven't decided this yet.)
We take that number of points and distribute it to the values of the matrix which were "correct", divvied up by how probable that result was. Finally, we use that matrix to extract new distributions for A and B.
EDIT: Then with these new distributions, we find the new means.
Am I making a glaring error here? I might be....
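A runnable sketch of the matrix recipe above, in Python (the grid width, the 0.1 increment, and the uniform starting weights are illustrative assumptions; the point-assignment and redistribution steps are omitted):

```python
def win_prob_grid(a_mean, b_mean, offsets, a_prob, b_prob):
    """Walk the difference matrix and the matching joint-probability
    matrix, summing the joint probability over every entry where
    A's rating exceeds B's."""
    p_win = 0.0
    for ob, pb in zip(offsets, b_prob):          # rows: player B
        for oa, pa in zip(offsets, a_prob):      # columns: player A
            diff = (a_mean + oa) - (b_mean + ob)  # diff_dist_mat entry
            if diff > 0:
                p_win += pa * pb                  # prob_mat entry
    return p_win

# A shared grid of offsets in increments of 0.1, from -1 to 1.
offsets = [round(-1.0 + 0.1 * k, 1) for k in range(21)]
n = len(offsets)
a_prob = [1.0 / n] * n   # uniform starting weights (illustrative)
b_prob = [1.0 / n] * n

# A's mean slightly above B's, so A should be favored.
p = win_prob_grid(0.15, 0.0, offsets, a_prob, b_prob)
```

One detail the strict `diff > 0` test exposes: ties on the diagonal count as neither win nor loss, so two players with identical distributions come out slightly under 1/2; a real implementation would have to decide how to split that probability mass.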
Tactics yes, Tact no...