Life In 19x19
http://www.lifein19x19.com/

Oddities in KGS ranking system
http://www.lifein19x19.com/viewtopic.php?f=24&t=4490
Page 1 of 4

Author:  tj86430 [ Thu Aug 18, 2011 10:52 am ]
Post subject:  Oddities in KGS ranking system

At the end of May I was barely 6k at KGS. I have since then played 5 ranked games, of which I have won 4. Now I'm 3k.

That seems like too rapid a rank improvement to me.

Data: http://valkonen.kapsi.fi/keksi.php?user=tj86430

Author:  judicata [ Thu Aug 18, 2011 10:55 am ]
Post subject:  Re: Oddities in KGS ranking system

Your rank is affected by the improvement in your opponents' rank (even after you played them).

Author:  tj86430 [ Thu Aug 18, 2011 11:00 am ]
Post subject:  Re: Oddities in KGS ranking system

judicata wrote:
Your rank is affected by the improvement in your opponents' rank (even after you played them).

All the opponents I beat during this period are still the same rank as when I played them. Only the one I lost to has gained two kyu since we played.

It must be the opponents I played a long time ago.

Author:  Mef [ Thu Aug 18, 2011 2:54 pm ]
Post subject:  Re: Oddities in KGS ranking system

The server had very low confidence in your rank (the 5k?) going into July, because there is an exponential decay on game weight. You went 3-1, then won again in August, so the server now basically treats you as if you've gone 4-1. It then looks at the fact that your loss was basically handicapped as a 3k. So the majority of your rank is based on the fact that you have won 4 games as a 4k (with no losses) and suffered 1 loss as a 3k.
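To make the decay concrete, here is a minimal Python sketch of an exponentially decaying game weight. The 30-day half-life, the function name and the game ages are hypothetical values chosen for illustration, not the actual KGS parameters.

```python
def game_weight(age_days, half_life_days=30.0):
    """Weight of a game that is age_days old.

    half_life_days is a hypothetical parameter, not the real KGS value:
    the weight halves every half_life_days.
    """
    return 0.5 ** (age_days / half_life_days)


# Illustrative (made-up) ages in days for games in a history like the one above.
ages = {"game from May or earlier": 90, "July game": 30, "August game": 5}

for label, age in ages.items():
    print(f"{label}: weight {game_weight(age):.2f}")
```

With any decay of this kind, a handful of recent games can outweigh a much longer but older record, which is why the server's confidence in the old rank was low.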

Author:  ez4u [ Thu Aug 18, 2011 4:26 pm ]
Post subject:  Re: Oddities in KGS ranking system

Why is any rating movement based on too few games "odd"?

Author:  Kaya.gs [ Mon Aug 22, 2011 7:40 am ]
Post subject:  Re: Oddities in KGS ranking system

I will soon make a whole new thread about rating systems, since I am looking for one for Kaya.gs, but I do want to mention an oddity about the KGS rating system.


Besides accounts being heavy and such, there is an impressive psychological aspect of the system that does not seem to affect point-based systems like those of Wbaduk or Tygem.

Back when I was playing on the danigabi[5d] account, I played certain 2ds, giving them 3 handicap stones. I would win and lose, and I think I won a bit more than I lost (say 60%). The impressive part came later. Right after losing a game, I would log back in as Rakuen[7d] and play the very same player at 6H. Suddenly, I would win almost 80%.

How is it possible that, after adding that many stones, my chances of winning go up? My current account, DexMorgan, has been brought up to 7d with a similar effect.

I think this is a specific anomaly of this history-based rating system, where the psychology of the players deeply affects the end results and hence its accuracy.

This is not just one game or a few: it has been shown by many strong players that giving enormous amounts of handicap gives a really good edge (while playing at the default handicap is disadvantageous).

Author:  daniel_the_smith [ Mon Aug 22, 2011 8:16 am ]
Post subject:  Re: Oddities in KGS ranking system

Kaya.gs wrote:
I will soon make a whole new thread about rating systems, since I am looking for one for Kaya.gs, but I do want to mention an oddity about the KGS rating system.


Besides accounts being heavy and such, there is an impressive psychological aspect of the system that does not seem to affect point-based systems like those of Wbaduk or Tygem.

Back when I was playing on the danigabi[5d] account, I played certain 2ds, giving them 3 handicap stones. I would win and lose, and I think I won a bit more than I lost (say 60%). The impressive part came later. Right after losing a game, I would log back in as Rakuen[7d] and play the very same player at 6H. Suddenly, I would win almost 80%.

How is it possible that, after adding that many stones, my chances of winning go up? My current account, DexMorgan, has been brought up to 7d with a similar effect.

I think this is a specific anomaly of this history-based rating system, where the psychology of the players deeply affects the end results and hence its accuracy.

This is not just one game or a few: it has been shown by many strong players that giving enormous amounts of handicap gives a really good edge (while playing at the default handicap is disadvantageous).


I think that is entirely explicable by psychology and by how badly 2ds play in general. You try harder with more stones; the 2d tries less hard and also has the effect of "ah, I just won, now I can be lazy". The 2d could be bad at using handi, and you could be good at making a whole-board fight where additional stones don't help that much. Etc. I personally seem to be 4-5 stones weaker in casual games...

All rating systems make the incorrect assumption that strength can be expressed as a single value-- cycles where A beats B beats C beats A obviously happen all the time. You've just discovered such a cycle with 2 players and differing handicap stones. What I'm saying is that you'd have to invent a multidimensional rating system to find one that could make sense of the data you just reported-- I'm not aware of any that handle more than one dimension of strength.

TL;DR: similar inconsistencies can be found in ALL rating systems.

Oh, and-- I believe in the KGS system, white is expected to win 60% of the time in handicap games. This is because a 2 stone handi is really 1.5 stones (black should have 6.5/7.5 reverse komi to make it genuinely a 2 stone game).

Author:  karaklis [ Mon Aug 22, 2011 8:27 am ]
Post subject:  Re: Oddities in KGS ranking system

The ranking system of KGS assumes that the improvement development of go players is the same. Other systems such as that of IGS assume that it does not change if you don't play (there).

Neither assumption is correct, but how do you want to measure the real improvement in order to obtain matches with a balanced win percentage? It's actually impossible. A compromise would be to keep the rank/rating fixed while you are not playing, but to make it more volatile, so that you can quickly reach your real strength after a longer break from playing. I don't have detailed knowledge of the ranking/rating system of OGS, but it seems that this idea has been implemented there. Maybe it's not a bad idea to have a look at the system there.

Author:  hyperpape [ Mon Aug 22, 2011 8:32 am ]
Post subject:  Re: Oddities in KGS ranking system

karaklis wrote:
The ranking system of KGS assumes that the improvement development of go players is the same.
What do you mean by this?

Author:  daniel_the_smith [ Mon Aug 22, 2011 8:37 am ]
Post subject:  Re: Oddities in KGS ranking system

WHR (Whole History Ratings) does correctly what KGS is attempting to do; KGS will change your rating now if the guy you beat a month ago gets stronger. WHR will change your rating of a month ago if it decides the guy you beat a month ago was actually at that time a stone stronger than the rating system thought he was.

KGS computes a rating for each player (the chart is created by appending the rating it calculates once per day).

WHR computes an entire rating history for each player each time it is run. IOW, every day, WHR will compute a (slightly) different chart of your rating over time.

WHR is the most theoretically advanced rating system I'm aware of, I really wish it were used somewhere...

Hm. And my last post made me want to invent a multi-dimensional rating system...

Author:  snorri [ Mon Aug 22, 2011 9:30 am ]
Post subject:  Re: Oddities in KGS ranking system

Kaya.gs wrote:
How is it possible that, after adding that many stones, my chances of winning go up?


That's a pretty impressive example. Maybe the best rating system is a blind one where you don't know how strong your opponents are until later :D

There are some questions as to whether the traditional handicap system / star-point placement really compensates for strength differences accurately. I once heard a pro say that he wanted to define amateur 1d as a certain amount of reverse komi taking black against a pro rather than as a multi-stone handicap because it is "easy for the stronger player to make the handicap stones useless and it doesn't depend so much on the number of stones, but points are points and their value can't be erased."

Author:  uPWarrior [ Mon Aug 22, 2011 11:04 am ]
Post subject:  Re: Oddities in KGS ranking system

The problem is that adding a new stone has less and less impact as the number of stones grows.

E.g.:
7d vs 2d at 5 handi and -6.5 komi, the expected win rate is 50%.
However, 7d vs 2d at 6 handi and -6.5 komi, the expected win rate is 79% for black.

Does anyone believe that a single new handicap stone could produce this difference in winning percentages? Even if black started winning 75% of his games with this new stone, his rating would go down.

I think that any sufficiently robust system should take this into account, either by decreasing the compensation value of each handicap stone as stones get added or by decreasing the impact of high-handicap games on the rating itself.

Author:  daniel_the_smith [ Mon Aug 22, 2011 11:11 am ]
Post subject:  Re: Oddities in KGS ranking system

uPWarrior wrote:
The problem is that adding a new stone has less and less impact as the number of stones grows.

E.g.:
7d vs 2d at 5 handi and -6.5 komi, the expected win rate is 50%.
However, 7d vs 2d at 6 handi and -6.5 komi, the expected win rate is 79% for black.

Does anyone believe that a single new handicap stone could produce this difference in winning percentages?


Sure, I could believe it, but I'd also like to know where your figures came from :)

BTW, AFAIK, EGF, AGA and KGS all do, in fact, take things like that into account.

Author:  xed_over [ Mon Aug 22, 2011 12:04 pm ]
Post subject:  Re: Oddities in KGS ranking system

hyperpape wrote:
karaklis wrote:
The ranking system of KGS assumes that the improvement development of go players is the same.
What do you mean by this?

He means that even if you don't play for a while (on KGS), the next time you log on, your rank will have increased, because the opponents you played in the past have improved.

KGS tries to make the assumption that even if you're not playing on KGS, that you're still playing somewhere and improving. And it does that by comparing you with your past opponents.

That's how I became SDK -- I quit playing.

Author:  hyperpape [ Mon Aug 22, 2011 3:50 pm ]
Post subject:  Re: Oddities in KGS ranking system

Is that what it assumes? Daniel's explanation fit what I understood better: it's not that KGS believes that you're improving, it's that as it gets more information on your opponents' skill (because they play more games), it changes its estimate of your strength ("that guy you beat was actually 2 dan, not 3 kyu, so that's way more impressive").

In fact, that seems to be based on the assumption that your strength as well as the strength of your previous opponents are relatively stable. Otherwise, it's meaningless that your opponent from three months ago was 2 dan. Another factor is that older games are weighted less heavily than more recent games.

Author:  wms [ Tue Aug 23, 2011 9:25 am ]
Post subject:  Re: Oddities in KGS ranking system

KGS is assuming that all ranks are stable, but the end result is the same as assuming that you improve with your opponents when you aren't playing.

When I did the research for the current KGS rank system, I did code up a system that would assume each player was improving at a constant rate, and it would try to compute the slope for your rank that best fit the available data. It made the rank system a lot more complex and run a lot slower, but in the end it made the system no better at predicting the outcome of future games (which is what I used as my metric for accuracy), so I took that algorithm out.

Author:  uPWarrior [ Tue Aug 23, 2011 11:12 am ]
Post subject:  Re: Oddities in KGS ranking system

daniel_the_smith wrote:
uPWarrior wrote:
The problem is that adding a new stone has less and less impact as the number of stones grows.

E.g.:
7d vs 2d at 5 handi and -6.5 komi, the expected win rate is 50%.
However, 7d vs 2d at 6 handi and -6.5 komi, the expected win rate is 79% for black.

Does anyone believe that a single new handicap stone could produce this difference in winning percentages?


Sure, I could believe it, but I'd also like to know where your figures came from :)


http://senseis.xmp.net/?KGSRatingMath

See expected win rates given rank differences.
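For reference, the 50% and 79% figures can be reproduced with a logistic win-probability curve. This is a minimal Python sketch under two assumptions: the curve has the form 1 / (1 + exp(-k * d)) with k = 1.3 for strong dan players (recalled from the linked page, not verified here), and each handicap stone offsets exactly one rank, which is the very premise questioned above. The function name is hypothetical.

```python
import math


def win_probability(rank_advantage, k=1.3):
    """Chance of winning for the side that is rank_advantage ranks ahead
    after handicap. k = 1.3 is an assumed steepness for high dan ranks."""
    return 1.0 / (1.0 + math.exp(-k * rank_advantage))


rank_gap = 5  # 7d versus 2d

for handicap in (5, 6):
    # Assumption: one handicap stone compensates exactly one rank.
    p_black = win_probability(handicap - rank_gap)
    print(f"{handicap} stones: black expected to win {p_black:.0%}")
```

Under these assumptions the sixth stone is treated as a full rank of advantage, which is where the jump from an even game to roughly four wins in five comes from.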

Author:  flOvermind [ Fri Aug 26, 2011 6:52 am ]
Post subject:  Re: Oddities in KGS ranking system

xed_over wrote:
KGS tries to make the assumption that even if you're not playing on KGS, that you're still playing somewhere and improving. And it does that by comparing you with your past opponents.


That's not true.

But KGS makes the assumption that it does not "know" your rank or the rank of your opponents for sure at any point in time. If its guess of the rank of a player changes (e.g. because it gets more data, that is, the player plays more games), that means that the previous guess must have been wrong. So it needs to adjust everything calculated with this "wrong guess", including the ratings of opponents that haven't played in a while.

So the problem is actually the other way round: because it assumes that players' ranks *do not change* over time, it has to correct your rating all the time as the ranks of others change. And because the rank of the average kyu player will go up rather than down, that typically results in an upward rank drift when you don't play. Note that this problem can't be solved by simply inserting the assumption that players improve at a constant rate. The system would still have to recalculate everything whenever a rank changes, now with a linear backwards interpolation instead of a simple constant; but with both players having the same assumed improvement rate, that wouldn't actually change anything.
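The re-estimation effect can be illustrated with a small Python sketch. It is not the KGS algorithm: it assumes a plain logistic win model with a hypothetical steepness k = 1.0, ratings in rank units (kyu negative, dan positive), no time weighting, and made-up game records. It finds the single rating at which a player's expected wins equal their actual wins, then repeats the fit after one old opponent has been re-estimated as much stronger.

```python
import math


def expected_score(rating, opponent, k=1.0):
    """Assumed logistic model: chance that `rating` beats `opponent`."""
    return 1.0 / (1.0 + math.exp(-k * (rating - opponent)))


def best_fit_rating(games, lo=-30.0, hi=30.0, iterations=100):
    """Rating at which expected wins equal actual wins (found by bisection).

    games is a list of (opponent_rating, won) pairs and must contain
    at least one win and one loss, otherwise no finite rating fits.
    """
    def surplus(rating):
        # Actual wins minus expected wins; decreases as rating grows.
        return sum((1.0 if won else 0.0) - expected_score(rating, opp)
                   for opp, won in games)

    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if surplus(mid) > 0:
            lo = mid  # won more than expected: rating guess too low
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical record: two wins and a loss against players rated 3k, 3k and 4k.
before = [(-3.0, True), (-3.0, False), (-4.0, True)]
# Later the first opponent plays more games and is re-estimated as 2d.
after = [(2.0, True), (-3.0, False), (-4.0, True)]

print(best_fit_rating(before), best_fit_rating(after))
```

The player has not played a single new game, yet the second fit is higher, because the same old win now counts as a win against a stronger opponent.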

IGS for example has the opposite problem: It assumes that its knowledge of the current rank is perfect, and any change to this number reflects a real change in skill, which is of course ridiculous. That assumption is actually a lot worse than the KGS assumption. That way, every misranked player hurts the long-term accuracy of the rating system, while in the KGS system there are only short-term effects, and the system will adapt with time as it gathers more data.


The solution to both problems, as has already been mentioned: WHR.
This system both assumes that ratings change over time, and that the knowledge of the "true" rating is always just a guess that can change later due to more data being available.

Author:  Kaya.gs [ Tue Aug 30, 2011 10:22 am ]
Post subject:  Re: Oddities in KGS ranking system

wms wrote:
KGS is assuming that all ranks are stable, but the end result is the same as assuming that you improve with your opponents when you aren't playing.

When I did the research for the current KGS rank system, I did code up a system that would assume each player was improving at a constant rate, and it would try to compute the slope for your rank that best fit the available data. It made the rank system a lot more complex and run a lot slower, but in the end it made the system no better at predicting the outcome of future games (which is what I used as my metric for accuracy), so I took that algorithm out.



I understand that the KGS rating system is more sophisticated than Wbaduk's, for example. I am not a ratings master at all, and I should start understanding a lot more about these things.

First of all, how do we know a rating system is accurate? How can we compare accuracy between KGS and Wbaduk?

I do believe Wbaduk has a larger sample of players, which means it should show less inaccuracy. However, they have the issue that players from 3d to weak 7d are almost the same strength, and then inside 7d you feel a 2-stone difference.
I don't know why that happens.

I think the KGS rating system feels very good from, say, 8k to 2d. From 3d up it starts to feel a little funky, but it's probably due to the lack of players. I can say that from 6d up, I have a certain disbelief in the ranks.


Back when KGS showed up, I remember that IGS (like Wbaduk) required you to play 20 games to get a solid rank, and that sucked.
But what I can't stand on KGS is that accounts get heavy. It feels like you are carrying a cross for all your previous losses, which is why people constantly make new accounts. In Wbaduk, maybe because of all the games needed to get a solid rank, I only have 1 account and I haven't met anyone trying to make a second one. I've never been unhappy with my Wbaduk rating, and it has moved a lot over time.


My feeling with history-based rating is that it tries to assign you a rank basically on average. So if you lose to 7d and beat 5d you are 6d. But the truth is that sometimes you play like 7d, and sometimes like 5d.

With point-based systems, as you play you approach the strength you have right now, not the average, which I think is natural and better.
Example:
KGS: I play and lose X games with 7d, then lose X games with 6d. Then I win X games with 5d and win X games with 6d. Given reasonable time frames, I would probably be at 6d.
Wbaduk: I play and lose X games with 7d, then I go down and lose X games with 6d. After that I'm 5d. Then I win X games, get to 6d, win X games, get to 7d.

I can't give hard examples with numbers off the top of my head, but I think this gets my point across.

What do you guys think?

I keep promising I will make the thread about rating; it will be up soon :)

Author:  HermanHiddema [ Tue Aug 30, 2011 10:47 am ]
Post subject:  Re: Oddities in KGS ranking system

Playing strength varies enormously depending on all sorts of conditions, such as thinking time, alcohol, lack of sleep, or whatever. Any rating system that tries to capture that playing strength in a single number is guaranteed to be inaccurate in that respect. That's why a rating system like Glicko also reports a deviation. So a 4kyu with deviation of 2 is 95% likely to play with a strength between 6kyu and 2kyu. That does not mean there is some precise actual strength between 2kyu and 6kyu that they really are. Rather, it means that even though their playing strength varies, the playing strength in any one game is very likely to be between those values.
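As a rough check on the interval arithmetic, here is a minimal Python sketch. It assumes the deviation is the standard deviation of a normal distribution measured in ranks (an assumption; Glicko itself works in rating points), so a 95% interval reaches about 1.96 deviations either side of the mean. Under that reading, the 6 kyu to 2 kyu range corresponds to a deviation of about 1; the figure of 2 matches if it is read as the half-width of the interval instead. The function name is hypothetical.

```python
from statistics import NormalDist


def kyu_interval(kyu, deviation, coverage=0.95):
    """Return (weaker end, stronger end) of the central interval, in kyu.

    Assumes a normal distribution with the given standard deviation in
    ranks; larger kyu numbers mean weaker play.
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # about 1.96 for 95%
    return kyu + z * deviation, kyu - z * deviation


weaker, stronger = kyu_interval(4, 1)
print(f"4 kyu, deviation 1: about {weaker:.1f} kyu to {stronger:.1f} kyu")
```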

But of course, for all sorts of purposes, from determining handicap to sorting players, you very much need a single number.

Now I think that often, players themselves are very much aware when their own strength is likely to be better or worse than their average. That is why people create separate accounts for blitz, or for playing casually instead of seriously. They don't want games that are likely to be bad to damage their rating too much.

Now there may be some ways to work around this issue, based on the player's own knowledge. Here's a few ideas:

Allow a player to secretly mark a game as "bad" before their first move. If they mark it as such, it will count less heavily for the rating (say, only 50%). This way, if you're tired, drunk or otherwise not in great shape, you can play with your main account with less chance of damage to your rating.

Allow a player to earn the right to a temporary promotion. For example: If a player wins 4 games in a row, they get one "promotion credit", with which they can start a single game at one rank higher than their usual rank. This is invisible to the opponent. Such a game, because it is played at the normal handicap for one rank higher, gives a player a chance to gain rating more quickly. I think many players would be really psyched to earn and play such games.

Allow a player to request a reevaluation every X games (say 50). If they do this, their next 3 games count more strongly for their rating. This allows a player who feels that his rating is lagging to quickly gain some points. The game counts for the opponent's rating as usual, not extra.

All times are UTC - 8 hours [ DST ]