A Curious Case Study in KGS Ranks

Mef
Lives in sente
Posts: 852
Joined: Fri Apr 23, 2010 8:34 am
Rank: KGS [-]
GD Posts: 428
Location: Central Coast
Has thanked: 201 times
Been thanked: 333 times

A Curious Case Study in KGS Ranks

Post by Mef »

There are many complaints espoused here and elsewhere about the inability of KGS's rating system to satisfy the needs of edge-case users. Frequently these discussions are emotionally charged, with only vague references to unsourced anecdotes, while it would be my preference for them to be more data driven. A strange turn of events has occurred recently that has allowed for an interesting evaluation of KGS's rating system behavior under extreme circumstances. Not being one to pass up a chance for investigation, I posit to L19 a case study. Specifically, what I feel it tests are the following two claims:

- If you play too many rated games, is it possible for your rating to become "stuck" to the point where even large streaks cannot move your rank? (If you play too many games, will it take a very long time for your rank to move?)

- Does KGS unnecessarily penalize losing streaks over winning streaks, to where players cannot advance due to having 1 bad day? (Does a losing streak "weigh you down" more than a winning streak can "bring you up"?)



The Details:

The bot GnuGo2 has played approximately 17,000 rated games in the last six months, averaging about a 41% win rate (41.7% if you remove the anomaly we're about to discuss). This places it firmly in the mid-to-lower 11k rating and makes it quite possibly as stable as any rank will ever be. Due to an unfortunate error in how the user running this bot had it implemented, in mid-March there was one day where the bot forfeited virtually all of its games, ultimately going 6-236 on the day. For your review I've attached a clipped version of the bot's rating graph for this year where the day in question is clearly visible.

To cover the highlights:

- Having 1 poor day (2.5% win rate) encompassing approximately 1.5% of the total games played in the 6 month period caused the bot's rating to drop about 1/5 of a stone (graph is only updated once / day so there is no finer resolution to use) in spite of having 17,000 games "anchoring" the rank.

- Upon being restored to "normal strength" the bot played 887 games (~5% of total games played in the 6 months), winning ~49% of them, and it took less than a week for the rank to essentially fully recover.

- The bot's win rate while being rated 1 stone lower than normal was ~57.5%, so nothing terribly extraordinary.
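
As a quick sanity check on the percentages in the highlights above, here is a minimal Python sketch that recomputes them from the quoted figures. The 17,000 total is the approximate number given in the post, and the variable names are only illustrative.

```python
# Figures quoted in the post; the 17,000 total is approximate.
TOTAL_GAMES = 17_000
BAD_DAY_WINS, BAD_DAY_LOSSES = 6, 236
RECOVERY_GAMES = 887


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


bad_day_games = BAD_DAY_WINS + BAD_DAY_LOSSES
bad_day_win_rate = percent(BAD_DAY_WINS, bad_day_games)
bad_day_share = percent(bad_day_games, TOTAL_GAMES)
recovery_share = percent(RECOVERY_GAMES, TOTAL_GAMES)

print(f"bad day: {bad_day_games} games, win rate {bad_day_win_rate:.1f}%")
print(f"bad day share of all games: {bad_day_share:.1f}%")
print(f"recovery share of all games: {recovery_share:.1f}%")
```

The results land close to the rounded figures in the list (about 2.5% win rate, roughly 1.5% of all games for the bad day, and roughly 5% for the recovery stretch).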


To me this suggests that even if you are an extreme edge case (I don't know of any human users who have managed 17,000 games in 6 months, in spite of how much many have tried), your rank is still mobile if you truly have statistically significant streaks. Further it suggests to me no matter how bad of a day you have (because this was basically the worst of bad days), it is not a particularly excessive burden to overcome (The rank was restored to normal without an excessively high win rate).


Thoughts?
Attachment: Annotated Rank Graph (GnuGo2.JPG)
illluck
Lives in sente
Posts: 1223
Joined: Sun Apr 25, 2010 5:07 am
Rank: OGS 2d
GD Posts: 0
KGS: illluck
Tygem: Trickprey
OGS: illluck
Has thanked: 736 times
Been thanked: 239 times

Re: A Curious Case Study in KGS Ranks

Post by illluck »

That seems like a demonstration of immobile rank to me - 6:236 and only dropping a fifth of a stone is pretty ridiculous.
Mef
Lives in sente
Posts: 852
Joined: Fri Apr 23, 2010 8:34 am
Rank: KGS [-]
GD Posts: 428
Location: Central Coast
Has thanked: 201 times
Been thanked: 333 times

Re: A Curious Case Study in KGS Ranks

Post by Mef »

illluck wrote:That seems like a demonstration of immobile rank to me - 6:236 and only dropping a fifth of a stone is pretty ridiculous.


To put this in perspective, this is equivalent to a normal player who plays 2 games/day having a 4-game losing streak in a day.
Dante31
Lives with ko
Posts: 129
Joined: Sat May 15, 2010 6:08 pm
Rank: KGS 4k
GD Posts: 0
Has thanked: 5 times
Been thanked: 14 times

Re: A Curious Case Study in KGS Ranks

Post by Dante31 »

Those who are willing to look at KGS ranks rationally know that KGS ranks do not get stuck. It's just that there are people who need something to blame for the fact that they are not progressing as fast as they would like.
RobertJasiek
Judan
Posts: 6273
Joined: Tue Apr 27, 2010 8:54 pm
GD Posts: 0
Been thanked: 797 times

Re: A Curious Case Study in KGS Ranks

Post by RobertJasiek »

The case study does not compare well to human players with frequent games, who need, without significant interruption, to win ca. 70+% for weeks up to a few months in order to improve a rank, after it has been VERY MUCH easier to drop a rank.

The problem can already be observed when 1 loss demotes a rank, but the next 2 or 3 games won do not necessarily promote a rank.

For any rating system to be perceived fair, there must be symmetry in the difficulties of decreasing and increasing one's rating. The KGS system lacks such a symmetry.
Mef
Lives in sente
Posts: 852
Joined: Fri Apr 23, 2010 8:34 am
Rank: KGS [-]
GD Posts: 428
Location: Central Coast
Has thanked: 201 times
Been thanked: 333 times

Re: A Curious Case Study in KGS Ranks

Post by Mef »

RobertJasiek wrote:The case study does not compare well to human players with frequent games, who need, without significant interruption, to win ca. 70+% for weeks up to a few months in order to improve a rank, after it has been VERY MUCH easier to drop a rank.

The problem can already be observed when 1 loss demotes a rank, but the next 2 or 3 games won do not necessarily promote a rank.

For any rating system to be perceived fair, there must be symmetry in the difficulties of decreasing and increasing one's rating. The KGS system lacks such a symmetry.



This has never been documented, only alluded to in unsupported anecdote that falls apart whenever data is collected. In fact, you personally were used as an example in a previous case study to demonstrate that this effect doesn't exist!

Edit: My apologies, I should have said: Two previous case studies
RobertJasiek
Judan
Posts: 6273
Joined: Tue Apr 27, 2010 8:54 pm
GD Posts: 0
Been thanked: 797 times

Re: A Curious Case Study in KGS Ranks

Post by RobertJasiek »

1) I have experienced my described rating / ranking behaviour for myself several (not only one, as you suggest) times.

2) Your linked case studies might be used for OTHER arguments (such as that I do not permanently win 70% of my KGS games, e.g., because(!!!) it is by far too frustrating to maintain a winning attitude when affected by the mentioned experience and continue playing only when not tired), but they do not refute my made experience.

3) I have heard from (or watched) several people that they have made similar experiences.

4) Since the effects have been experienced, they DO exist. (And no, I have not bothered to protocol them. I have better uses for my time.)
RBerenguel
Gosei
Posts: 1585
Joined: Fri Nov 18, 2011 11:44 am
Rank: KGS 5k
GD Posts: 0
KGS: RBerenguel
Tygem: rberenguel
Wbaduk: JohnKeats
Kaya handle: RBerenguel
Online playing schedule: KGS on Saturday I use to be online, but I can be if needed from 20-23 GMT+1
Location: Barcelona, Spain (GMT+1)
Has thanked: 576 times
Been thanked: 298 times

Re: A Curious Case Study in KGS Ranks

Post by RBerenguel »

RobertJasiek wrote:4) Since the effects have been experienced, they DO exist. (And no, I have not bothered to protocol them. I have better uses for my time.)


¿¿?? Robert, you are a mathematician. Come on!
Geek of all trades, master of none: the motto for my blog mostlymaths.net
RobertJasiek
Judan
Posts: 6273
Joined: Tue Apr 27, 2010 8:54 pm
GD Posts: 0
Been thanked: 797 times

Re: A Curious Case Study in KGS Ranks

Post by RobertJasiek »

A fix for the rating system? Easy, use a different system:

- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.
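
To make the proposed rule concrete, here is a minimal sketch in Python. The numeric scale (30k = 0.0, 1d = 30.0, 9d = 38.0), the function name `update_rank`, and the reading of "handicap 1" as any nonzero handicap value are assumptions made for illustration; the proposal itself does not specify them.

```python
STEP = 0.1        # ranks gained per win, lost per loss
MAX_RANK = 38.0   # 9d on the assumed scale: 30k = 0.0, 1k = 29.0, 1d = 30.0


def update_rank(rank, opponent_rank, won, handicap=0):
    """Return the player's rank after one game under the proposed rules."""
    if handicap != 0:
        # Handicap games (including handicap 1) are ignored.
        return rank
    if abs(rank - opponent_rank) > 2:
        # Games across a rank difference of more than 2 are ignored.
        return rank
    new_rank = rank + STEP if won else rank - STEP
    # Nobody may exceed 9d.
    return min(new_rank, MAX_RANK)


# Usage: a 11k (19.0 on the assumed scale) beats a 10k in an even game.
print(update_rank(19.0, 20.0, won=True))
```

Note that nothing in the rule looks at the player's history, so every counted game moves the rank by the same amount no matter how many games came before it.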
RBerenguel
Gosei
Posts: 1585
Joined: Fri Nov 18, 2011 11:44 am
Rank: KGS 5k
GD Posts: 0
KGS: RBerenguel
Tygem: rberenguel
Wbaduk: JohnKeats
Kaya handle: RBerenguel
Online playing schedule: KGS on Saturday I use to be online, but I can be if needed from 20-23 GMT+1
Location: Barcelona, Spain (GMT+1)
Has thanked: 576 times
Been thanked: 298 times

Re: A Curious Case Study in KGS Ranks

Post by RBerenguel »

RobertJasiek wrote:A fix for the rating system? Easy, use a different system:

- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.


I'm tempted to run a Monte Carlo simulation of such a system. Maybe I'll do, could be fun.
Geek of all trades, master of none: the motto for my blog mostlymaths.net
Charles Alden
Beginner
Posts: 10
Joined: Mon Jul 01, 2013 5:00 pm
Rank: AGA 2 dan
GD Posts: 0
Been thanked: 6 times

Re: A Curious Case Study in KGS Ranks

Post by Charles Alden »

RobertJasiek wrote:A fix for the rating system? Easy, use a different system:

- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.

RBerenguel wrote:I'm tempted to run a Monte Carlo simulation of such a system. Maybe I'll do, could be fun.


Under that system, wouldn't the bot's rating in Mef's example have moved to 34k the following day?
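
The 34k figure can be reproduced with a little arithmetic. This sketch assumes the bot starts at 11k (as described in the opening post), that every game of the 6-236 day counts, and that the handicap and rank-difference filters never exclude a game; the variable names are illustrative.

```python
STEP = 0.1                # ranks per counted game under the proposed rule
wins, losses = 6, 236     # the bot's record on the bad day
start_kyu = 11            # assumed starting rank: 11k

# Net movement in ranks; a negative value means the player got weaker.
net_change = round((wins - losses) * STEP, 1)

# Kyu numbers grow as the rank gets weaker, so subtract the net change.
end_kyu = start_kyu - net_change

print(f"net change: {net_change} ranks, ending around {end_kyu:.0f}k")
```

In practice the rank-difference filter would probably start excluding games once the bot had fallen well below its usual opponents, so this is an upper bound on the drop rather than a prediction.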
HermanHiddema
Gosei
Posts: 2011
Joined: Tue Apr 20, 2010 10:08 am
Rank: Dutch 4D
GD Posts: 645
Universal go server handle: herminator
Location: Groningen, NL
Has thanked: 202 times
Been thanked: 1086 times

Re: A Curious Case Study in KGS Ranks

Post by HermanHiddema »

RobertJasiek wrote:A fix for the rating system? Easy, use a different system:

- +0.1 ranks for a win, -0.1 ranks for a loss.
- Ignore all handicap games (incl. those with handicap 1).
- Ignore games with a rank difference >2.
- Maximum rank 9d.


Which is deflationary. Every 20k that enters the system and moves up to 1d has removed 20 ranks total from the other players. That's no problem in a small playing pool like a club, where I think this kind of system is fine, as you can just manually recalibrate all ranks every once in a while, but on a go server it is unsuitable.

In a deflationary system, playing more games means you lose rating quicker. So you're replacing "My rating is stuck because I play so much" with "My rating keeps dropping because I play so much". How is that better?
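
The conservation argument can be checked with a small simulation sketch. Everything concrete here is hypothetical: the pool of 50 players, the numeric rank values, the newcomer's 80% win rate, and the 200 games. The handicap and rank-difference filters are left out because ignored games change no ranks, and the 9d cap is not modeled.

```python
import random

STEP = 0.1


def play(ranks, winner, loser):
    """One even game under the +0.1/-0.1 rule: the winner gains what the loser gives up."""
    ranks[winner] += STEP
    ranks[loser] -= STEP


random.seed(0)

# Hypothetical pool: 50 established players plus one underrated newcomer.
others = [f"p{i}" for i in range(50)]
ranks = {name: 10.0 for name in others}
ranks["newcomer"] = 0.0

start = ranks["newcomer"]
total_before = sum(ranks.values())
others_before = sum(ranks[name] for name in others)

for _ in range(200):
    opponent = random.choice(others)
    # Hypothetical: the newcomer is truly stronger and wins 80% of games.
    if random.random() < 0.8:
        play(ranks, "newcomer", opponent)
    else:
        play(ranks, opponent, "newcomer")

newcomer_gain = ranks["newcomer"] - start
others_loss = others_before - sum(ranks[name] for name in others)
total_after = sum(ranks.values())

print(f"newcomer gained {newcomer_gain:.1f} ranks")
print(f"the other players lost {others_loss:.1f} ranks in total")
```

Whatever the random draws turn out to be, the total over all players stays the same, so the newcomer's gain is exactly the amount the rest of the pool has lost.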
Pippen
Lives in gote
Posts: 677
Joined: Thu Sep 16, 2010 3:34 pm
GD Posts: 0
KGS: 2d
Has thanked: 6 times
Been thanked: 31 times

Re: A Curious Case Study in KGS Ranks

Post by Pippen »

I am a 5D-Tygem and 1d-KGS. From my experience with Tygem I can say: Ranks at KGS are more stable and consistent than Tygem's. On Tygem you will find some more differences within one rank. Sometimes you play guys that seem like 1-2 stones weaker, sometimes 1-2 stronger, but all have the same rank. But here comes the advantage of such a thing: It's more fun, you have faster chances to get promoted/demoted and to play stronger players that wouldn't play you otherwise. KGS ranking is sounder, but more boring and since ranking is maybe the main motivation to play and stay in Go, it's significant.

I'd like KGS to copy Tygem's ranking system, i.e. a system of x-game series where you get promoted when you win y games and demoted when you lose z games out of it.
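
A series rule of this kind could be sketched as follows. The function name and the example values for x, y, and z are hypothetical placeholders, not Tygem's actual parameters.

```python
def series_outcome(results, x, y, z):
    """Judge one series of at most x games.

    results: iterable of booleans in playing order (True = win).
    Returns "promote" as soon as y wins are reached, "demote" as soon as
    z losses are reached, or "stay" if neither happens within x games.
    """
    wins = losses = 0
    for played, won in enumerate(results, start=1):
        if played > x:
            break
        if won:
            wins += 1
        else:
            losses += 1
        if wins >= y:
            return "promote"
        if losses >= z:
            return "demote"
    return "stay"


# Usage with made-up thresholds: a 20-game series, 14 wins or 14 losses decide it.
print(series_outcome([True] * 14, x=20, y=14, z=14))
```

Unlike a rating that is recomputed from the full game history, this rule only looks at the current series, which is why rank changes come faster and are noisier.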
uPWarrior
Lives with ko
Posts: 199
Joined: Mon Jan 17, 2011 1:59 pm
Rank: KGS 3 kyu
GD Posts: 0
Has thanked: 6 times
Been thanked: 55 times

Re: A Curious Case Study in KGS Ranks

Post by uPWarrior »

It's funny how Robert just proposed removing all handicap games from the calculation while in a different topic I proposed that only handicap games should be considered so we don't rely on arbitrary win percentages.
Mef
Lives in sente
Posts: 852
Joined: Fri Apr 23, 2010 8:34 am
Rank: KGS [-]
GD Posts: 428
Location: Central Coast
Has thanked: 201 times
Been thanked: 333 times

Re: A Curious Case Study in KGS Ranks

Post by Mef »

Pippen wrote:I am a 5D-Tygem and 1d-KGS. From my experience with Tygem I can say: Ranks at KGS are more stable and consistent than Tygem's. On Tygem you will find some more differences within one rank. Sometimes you play guys that seem like 1-2 stones weaker, sometimes 1-2 stronger, but all have the same rank. But here comes the advantage of such a thing: It's more fun, you have faster chances to get promoted/demoted and to play stronger players that wouldn't play you otherwise. KGS ranking is sounder, but more boring and since ranking is maybe the main motivation to play and stay in Go, it's significant.

I'd like KGS to copy Tygem's ranking system, i.e. a system of x-game series where you get promoted when you win y games and demoted when you lose z games out of it.


KGS's rating system aims to provide the most accurate rank it can with all data available. It aims to do the best job of predicting the probable outcome between any two players and any handicap (though in practice it only accepts feedback from games H6 or less).

Tygem's rating system does not make any predictions. It does not handle handicap games. It does not make any attempt to ensure proper rank spacing. It suffers from large amounts of noise being introduced by players setting their own ranks. Under an ideal set of assumptions (all ranks properly spaced, all players properly ranked, etc.) you still expect to spend 30% of your time at the wrong rank. Tygem's rating system has a place in the go world and many people find it fun. Accurately assessing your go strength and comparing yourself on a fixed scale to a larger pool of players isn't it.