A Curious Case Study in KGS Ranks
-
Mef
- Lives in sente
- Posts: 852
- Joined: Fri Apr 23, 2010 8:34 am
- Rank: KGS [-]
- GD Posts: 428
- Location: Central Coast
- Has thanked: 201 times
- Been thanked: 333 times
A Curious Case Study in KGS Ranks
There are many complaints voiced here and elsewhere about the inability of KGS's rating system to satisfy the needs of edge-case users. Frequently these discussions are emotionally charged, with only vague references to unsourced anecdotes, while my preference would be for them to be more data-driven. A strange turn of events has occurred recently that has allowed for an interesting evaluation of the KGS rating system's behavior under extreme circumstances. Not being one to pass up a chance for investigation, I posit to L19 a case study. Specifically, what I feel it tests are the following two claims:
- If you play too many rated games, is it possible for your rating to become "stuck" to the point where even large streaks cannot move your rank? (If you play too many games, will it take a very long time for your rank to move?)
- Does KGS unnecessarily penalize losing streaks over winning streaks, to the point where players cannot advance due to having 1 bad day? (Does a losing streak "weigh you down" more than a winning streak can "bring you up"?)
The Details:
The bot GnuGo2 has played approximately 17,000 rated games in the last six months, averaging about a 41% win rate (41.7% if you remove the anomaly we're about to discuss). This places it firmly in the mid-to-lower 11k range and makes its rank quite possibly as stable as any rank will ever be. Due to an unfortunate error in how the user running this bot had it implemented, in mid-March there was one day where the bot forfeited virtually all of its games, ultimately going 6-236 on the day. For your review I've attached a clipped version of the bot's rating graph for this year, where the day in question is clearly visible.
To cover the highlights:
- Having 1 poor day (2.5% win rate) encompassing approximately 1.5% of the total games played in the 6-month period caused the bot's rating to drop about 1/5 of a stone (the graph is only updated once per day, so there is no finer resolution to use) in spite of having 17,000 games "anchoring" the rank.
- Upon being restored to "normal strength" the bot played 887 games (~5% of total games played in the 6 months), winning ~49% of them, and it took less than a week for the rank to essentially fully recover.
- The bot's win rate while being rated 1 stone lower than normal was ~57.5%, so nothing terribly extraordinary.
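The percentages in these highlights can be rechecked from the raw counts quoted in the post. This is only a minimal sketch: it takes 17,000 as the six-month total (the post says "approximately"), and the variable names are chosen purely for illustration.

```python
# Counts quoted in the post; 17,000 is the approximate total given there.
total_games = 17_000
bad_day_wins, bad_day_losses = 6, 236
recovery_games = 887

bad_day_games = bad_day_wins + bad_day_losses
bad_day_win_rate = bad_day_wins / bad_day_games   # win rate on the bad day
bad_day_share = bad_day_games / total_games       # bad day as a share of all games
recovery_share = recovery_games / total_games     # recovery games as a share of all games

print(f"bad-day games:    {bad_day_games}")
print(f"bad-day win rate: {bad_day_win_rate:.1%}")
print(f"bad-day share:    {bad_day_share:.1%}")
print(f"recovery share:   {recovery_share:.1%}")
```

With the approximate total, the shares come out close to the rounded figures given in the list.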
To me this suggests that even if you are an extreme edge case (I don't know of any human users who have managed 17,000 games in 6 months, however hard many have tried), your rank is still mobile if you truly have statistically significant streaks. Further, it suggests to me that no matter how bad a day you have (because this was basically the worst of bad days), it is not a particularly excessive burden to overcome (the rank was restored to normal without an excessively high win rate).
Thoughts?
- Attachments
-
- Annotated Rank Graph
- GnuGo2.JPG (33.05 KiB) Viewed 16917 times
-
illluck
- Lives in sente
- Posts: 1223
- Joined: Sun Apr 25, 2010 5:07 am
- Rank: OGS 2d
- GD Posts: 0
- KGS: illluck
- Tygem: Trickprey
- OGS: illluck
- Has thanked: 736 times
- Been thanked: 239 times
Re: A Curious Case Study in KGS Ranks
That seems like a demonstration of immobile rank to me - 6:236 and only dropping a fifth of a stone is pretty ridiculous.
-
Mef
- Lives in sente
- Posts: 852
- Joined: Fri Apr 23, 2010 8:34 am
- Rank: KGS [-]
- GD Posts: 428
- Location: Central Coast
- Has thanked: 201 times
- Been thanked: 333 times
Re: A Curious Case Study in KGS Ranks
illluck wrote:That seems like a demonstration of immobile rank to me - 6:236 and only dropping a fifth of a stone is pretty ridiculous.
To put this in perspective, this is equivalent to a normal player who plays 2 games/day having a 4-game losing streak in a day.
- Dante31
- Lives with ko
- Posts: 129
- Joined: Sat May 15, 2010 6:08 pm
- Rank: KGS 4k
- GD Posts: 0
- Has thanked: 5 times
- Been thanked: 14 times
Re: A Curious Case Study in KGS Ranks
Those who are willing to look at KGS ranks rationally know that KGS ranks do not get stuck. It's just that there are people who need something to blame for the fact that they are not progressing as fast as they would like.
-
RobertJasiek
- Judan
- Posts: 6273
- Joined: Tue Apr 27, 2010 8:54 pm
- GD Posts: 0
- Been thanked: 797 times
- Contact:
Re: A Curious Case Study in KGS Ranks
The case study does not compare well to human players with frequent games, who need, without significant interruption, to win ca. 70+% for weeks up to a few months in order to improve a rank, after it has been VERY MUCH easier to drop a rank.
The problem can already be observed when 1 loss demotes a rank, but the next 2 or 3 games won do not necessarily promote a rank.
For any rating system to be perceived fair, there must be symmetry in the difficulties of decreasing and increasing one's rating. The KGS system lacks such a symmetry.
-
Mef
- Lives in sente
- Posts: 852
- Joined: Fri Apr 23, 2010 8:34 am
- Rank: KGS [-]
- GD Posts: 428
- Location: Central Coast
- Has thanked: 201 times
- Been thanked: 333 times
Re: A Curious Case Study in KGS Ranks
RobertJasiek wrote:The case study does not compare well to human players with frequent games, who need, without significant interruption, to win ca. 70+% for weeks up to a few months in order to improve a rank, after it has been VERY MUCH easier to drop a rank.
The problem can already be observed when 1 loss demotes a rank, but the next 2 or 3 games won do not necessarily promote a rank.
For any rating system to be perceived fair, there must be symmetry in the difficulties of decreasing and increasing one's rating. The KGS system lacks such a symmetry.
This has never been documented, only alluded to in unsupported anecdote that falls apart whenever data is collected. In fact, you personally were used as an example in a previous case study to demonstrate that this effect doesn't exist!
Edit: My apologies, I should have said: Two previous case studies
-
RobertJasiek
- Judan
- Posts: 6273
- Joined: Tue Apr 27, 2010 8:54 pm
- GD Posts: 0
- Been thanked: 797 times
- Contact:
Re: A Curious Case Study in KGS Ranks
1) I have experienced my described rating / ranking behaviour for myself several (not only one, as you suggest) times.
2) Your linked case studies might be used for OTHER arguments (such as that I do not permanently win 70% of my KGS games, e.g., because(!!!) it is by far too frustrating to maintain a winning attitude when affected by the mentioned experience and continue playing only when not tired), but they do not refute my made experience.
3) I have heard from (or watched) several people that they have made similar experiences.
4) Since the effects have been experienced, they DO exist. (And no, I have not bothered to protocol them. I have better uses for my time.)
- RBerenguel
- Gosei
- Posts: 1585
- Joined: Fri Nov 18, 2011 11:44 am
- Rank: KGS 5k
- GD Posts: 0
- KGS: RBerenguel
- Tygem: rberenguel
- Wbaduk: JohnKeats
- Kaya handle: RBerenguel
- Online playing schedule: KGS on Saturday I use to be online, but I can be if needed from 20-23 GMT+1
- Location: Barcelona, Spain (GMT+1)
- Has thanked: 576 times
- Been thanked: 298 times
- Contact:
Re: A Curious Case Study in KGS Ranks
RobertJasiek wrote:4) Since the effects have been experienced, they DO exist. (And no, I have not bothered to protocol them. I have better uses for my time.)
¿¿?? Robert, you are a mathematician. Come on!
Geek of all trades, master of none: the motto for my blog mostlymaths.net
-
RobertJasiek
- Judan
- Posts: 6273
- Joined: Tue Apr 27, 2010 8:54 pm
- GD Posts: 0
- Been thanked: 797 times
- Contact:
Re: A Curious Case Study in KGS Ranks
A fix for the rating system? Easy, use a different system:
- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.
- RBerenguel
- Gosei
- Posts: 1585
- Joined: Fri Nov 18, 2011 11:44 am
- Rank: KGS 5k
- GD Posts: 0
- KGS: RBerenguel
- Tygem: rberenguel
- Wbaduk: JohnKeats
- Kaya handle: RBerenguel
- Online playing schedule: KGS on Saturday I use to be online, but I can be if needed from 20-23 GMT+1
- Location: Barcelona, Spain (GMT+1)
- Has thanked: 576 times
- Been thanked: 298 times
- Contact:
Re: A Curious Case Study in KGS Ranks
RobertJasiek wrote:A fix for the rating system? Easy, use a different system:
- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.
I'm tempted to run a Monte Carlo simulation of such a system. Maybe I'll do, could be fun.
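A minimal sketch of what such a Monte Carlo run might look like is given here. Nothing in it comes from the thread beyond the +0.1 / -0.1 rule: the function name `simulate`, the logistic win-probability model with slope `k`, and the assumption that every opponent sits exactly at the player's current rating in an even game (so the handicap and rank-difference filters never trigger) are all hypothetical choices made for illustration. The 9d cap is not modelled.

```python
import math
import random


def simulate(true_strength, start_rating, games, step=0.1, k=1.0, seed=0):
    """Random walk of a rating under a flat +step / -step rule.

    Assumptions (not from the thread): every opponent sits exactly at the
    player's current rating, and the win probability is a logistic function
    of (true_strength - rating) with slope k.
    """
    rng = random.Random(seed)
    rating = start_rating
    history = []
    for _ in range(games):
        p_win = 1.0 / (1.0 + math.exp(-k * (true_strength - rating)))
        rating += step if rng.random() < p_win else -step
        history.append(rating)
    return history


# Ratings are on a numeric scale measured in ranks; only differences matter.
# The player starts 3 ranks below an assumed true strength of 0.
history = simulate(true_strength=0.0, start_rating=-3.0, games=10_000)
tail = history[-5_000:]
print(f"mean rating over last 5000 games: {sum(tail) / len(tail):+.2f}")
print(f"range over last 5000 games: {min(tail):+.2f} to {max(tail):+.2f}")
```

Under these assumptions the rating should climb toward the true strength and then wander around it. How wide that wandering is depends heavily on the assumed slope `k`, so the output says more about the model than about any real server.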
Geek of all trades, master of none: the motto for my blog mostlymaths.net
-
Charles Alden
- Beginner
- Posts: 10
- Joined: Mon Jul 01, 2013 5:00 pm
- Rank: AGA 2 dan
- GD Posts: 0
- Been thanked: 6 times
Re: A Curious Case Study in KGS Ranks
Easy, use a different system:
- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.
I'm tempted to run a Monte Carlo simulation of such a system. Maybe I'll do, could be fun.
Under which system, in Mef's example the bot's rating would have moved to 34k the following day?
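The 34k figure follows from simple arithmetic on the proposed rule. The sketch below uses the numbers quoted in the thread (a bot around 11k going 6-236) and assumes every one of those games counts, i.e. it ignores the proposed filter on rank differences greater than 2. The helper name `flat_rule_change` is made up for illustration.

```python
def flat_rule_change(wins, losses, step=0.1):
    """Net rank change under a flat +step per win, -step per loss rule."""
    return (wins - losses) * step


# Figures from the thread: a bot around 11k going 6-236 in one day.
start_kyu = 11
change = flat_rule_change(wins=6, losses=236)  # negative means weaker
end_kyu = start_kyu - change  # kyu numbers grow as the player gets weaker

print(f"net change: {change:+.1f} ranks")
print(f"rank after the bad day: about {end_kyu:.0f}k")
```

A net of 230 losses at 0.1 ranks each is 23 ranks, which takes 11k down to roughly 34k.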
- HermanHiddema
- Gosei
- Posts: 2011
- Joined: Tue Apr 20, 2010 10:08 am
- Rank: Dutch 4D
- GD Posts: 645
- Universal go server handle: herminator
- Location: Groningen, NL
- Has thanked: 202 times
- Been thanked: 1086 times
Re: A Curious Case Study in KGS Ranks
RobertJasiek wrote:A fix for the rating system? Easy, use a different system:
- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.
Which is deflationary. Every 20k that enters the system and moves up to 1d has removed 20 ranks total from the other players. That's no problem in a small playing pool like a club, where I think this kind of system is fine, as you can just manually recalibrate all ranks every once in a while, but on a go server it is unsuitable.
In a deflationary system, playing more games means you lose rating quicker. So you're replacing "My rating is stuck because I play so much" with "My rating keeps dropping because I play so much". How is that better?
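The deflation argument rests on the flat rule being zero-sum: whatever one player gains, the opponent loses. A small sketch under made-up conditions (a hypothetical pool of 100 established players all at 10k, a newcomer entering at 20k, and the rank-difference filter ignored for simplicity) shows the bookkeeping:

```python
STEP = 0.1  # ranks exchanged per game under the flat rule

# Numeric scale in ranks with 20k = 0, so 1k = 19 and 1d = 20.
pool = [10.0] * 100          # hypothetical established players, all 10k
newcomer = 0.0               # enters at 20k

pool_total_before = sum(pool)

# The newcomer improves and finishes 200 wins ahead: 3 wins, then 1 loss,
# against each of the 100 pool members in turn.
for i in range(100):
    for result in (+1, +1, +1, -1):
        newcomer += result * STEP   # newcomer gains on a win, loses on a loss
        pool[i] -= result * STEP    # the opponent moves by the opposite amount

pool_total_after = sum(pool)
print(f"newcomer gained: {newcomer:.1f} ranks")
print(f"pool total changed by: {pool_total_after - pool_total_before:+.1f} ranks")
```

The newcomer's 20 ranks come entirely out of the pool, here 0.2 ranks per established player, even though none of those players got any weaker.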
-
Pippen
- Lives in gote
- Posts: 677
- Joined: Thu Sep 16, 2010 3:34 pm
- GD Posts: 0
- KGS: 2d
- Has thanked: 6 times
- Been thanked: 31 times
Re: A Curious Case Study in KGS Ranks
I am a 5D-Tygem and 1d-KGS. From my experience with Tygem I can say: Ranks at KGS are more stable and consistent than Tygem's. On Tygem you will find some more differences within one rank. Sometimes you play guys that seem like 1-2 stones weaker, sometimes 1-2 stronger, but all have the same rank. But here comes the advantage of such a thing: It's more fun, you have faster chances to get promoted/demoted and to play stronger players that wouldn't play you otherwise. KGS ranking is sounder, but more boring and since ranking is maybe the main motivation to play and stay in Go, it's significant.
I'd like KGS to copy Tygem's ranking system, i.e. a system of x-game series where you get promoted when you win y games and demoted when you lose z games out of it.
-
uPWarrior
- Lives with ko
- Posts: 199
- Joined: Mon Jan 17, 2011 1:59 pm
- Rank: KGS 3 kyu
- GD Posts: 0
- Has thanked: 6 times
- Been thanked: 55 times
Re: A Curious Case Study in KGS Ranks
It's funny how Robert just proposed removing all handicap games from the calculation while in a different topic I proposed that only handicap games should be considered so we don't rely on arbitrary win percentages.
-
Mef
- Lives in sente
- Posts: 852
- Joined: Fri Apr 23, 2010 8:34 am
- Rank: KGS [-]
- GD Posts: 428
- Location: Central Coast
- Has thanked: 201 times
- Been thanked: 333 times
Re: A Curious Case Study in KGS Ranks
Pippen wrote:I am a 5D-Tygem and 1d-KGS. From my experience with Tygem I can say: Ranks at KGS are more stable and consistent than Tygem's. On Tygem you will find some more differences within one rank. Sometimes you play guys that seem like 1-2 stones weaker, sometimes 1-2 stronger, but all have the same rank. But here comes the advantage of such a thing: It's more fun, you have faster chances to get promoted/demoted and to play stronger players that wouldn't play you otherwise. KGS ranking is sounder, but more boring and since ranking is maybe the main motivation to play and stay in Go, it's significant.
I'd like KGS to copy Tygem's ranking system, i.e. a system of x-game series where you get promoted when you win y games and demoted when you lose z games out of it.
KGS's rating system aims to provide the most accurate rank it can with all data available. It aims to do the best job of predicting the probable outcome between any two players and any handicap (though in practice it only accepts feedback from games H6 or less).
Tygem's rating system does not make any predictions. It does not handle handicap games. It does not make any attempt to ensure proper rank spacing. It suffers from large amounts of noise being introduced by players setting their own ranks. Under an ideal set of assumptions (all ranks properly spaced, all players properly ranked, etc.) you would still expect to spend 30% of your time at the wrong rank. Tygem's rating system has a place in the go world and many people find it fun. Accurately assessing your go strength and comparing yourself on a fixed scale to a larger pool of players isn't it.