Re: KGS ranking revisited
Posted: Fri May 11, 2012 10:21 pm
Life in 19x19. Go, Weiqi, Baduk... That's the life.
https://www.lifein19x19.com/
jts wrote: So your objection is not that it's erratic per se

Mainly, my objection is to the system's design errors.
emeraldemon wrote: look at the average win-rate of every player over an appreciable number of games, and find the average distance from 50%.

This is a simplifying theory, but not quite true. When the rating system is bad, some players can play worse than usual: the system expects them to win much more than 50% of their games, winning that much is tiring, and so they win less than they would if they were not forced into becoming tired. E.g., I (and others, from whom I have heard the same) can win ca. 10-12 games in a row, but then one becomes so tired that winning 20-24 games in a row is out of the question. Rather quickly, lost games occur: first 1, then 2, then 4, then 8. The more tired one becomes, the greater the percentage of lost games.
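emeraldemon's proposed measure is easy to make concrete. Here is a minimal Python sketch; the function name and the sample game records are invented purely for illustration, not anything taken from KGS:

```python
# Sketch of the proposed metric: for each player, take the win rate over a
# sample of games and measure its distance from the 50% that a perfectly
# calibrated handicap/rating system would target, then average the distances.

def average_distance_from_50(results_by_player):
    """results_by_player maps player name -> list of outcomes (1 = win, 0 = loss)."""
    distances = []
    for outcomes in results_by_player.values():
        win_rate = sum(outcomes) / len(outcomes)
        distances.append(abs(win_rate - 0.5))
    return sum(distances) / len(distances)

sample = {
    "alice": [1, 1, 1, 0, 1, 1, 0, 1],  # 75%: paired against too-weak opponents
    "bob":   [0, 1, 0, 1, 1, 0, 0, 1],  # 50%: well matched
    "carol": [0, 0, 1, 0, 0, 1, 0, 0],  # 25%: paired against too-strong opponents
}
print(average_distance_from_50(sample))  # averages 0.25, 0.0, 0.25 -> about 0.167
```

A value near zero would suggest the system pairs and handicaps players well; larger values point at systematic mismatches, or, per the tiredness argument, at players being pushed away from 50% by the system itself.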
It is entirely possible, however, that the player population on KGS, on average, plays enough games outside of the server to make the system more accurate than it would be without this inflation. Moreover, the system obviously places less confidence in these inactivity-inflated ranks, meaning that if the increase in rank was not warranted, the correction should in theory not take too long. Whether this works in practice depends on the playing habits of the player population, but I can say that in the case of KGS it works well enough for me.
hyperpape wrote: One adaptation is to use all the variations of komi between 6.5 and 0.5 as appropriate. Of course this doesn't remove the problem entirely.

Please don't. If game-to-game variance is greater than komi, as I suspect it is for almost all amateur players, or if the average systematic error is greater than that, as it almost certainly is, this gains nothing and isn't worth the confusion it would cause. It is false precision. When IGS switched to half-ranks and therefore had games with reverse komi for 1-stone differences, it took me some time to adjust. I'm okay with it now and don't have to recheck the komi in every game, but with a continuous komi system I would have to, so I'd probably just set the komi manually to some common value before the game rather than use some microrank-derived setting.
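For what it's worth, the komi-interpolation idea can be sketched. The conversion below (one rank ≈ one handicap stone ≈ 13 points, even-game komi 6.5) and the rounding convention are assumptions of mine for illustration only; they are not KGS or IGS rules:

```python
import math

EVEN_KOMI = 6.5
POINTS_PER_RANK = 13.0  # assumed point value of one rank / one handicap stone

def game_settings(rank_gap):
    """Map a non-negative continuous rank gap to (handicap stones, komi).

    Whole ranks become handicap stones; the leftover fraction is absorbed
    by lowering komi from 6.5 toward 0.5 (and into reverse komi beyond
    half a rank). Komi is kept at a half-point value to avoid ties.
    """
    stones = int(rank_gap)
    frac = rank_gap - stones
    komi = math.floor(EVEN_KOMI - frac * POINTS_PER_RANK) + 0.5
    return stones, komi

print(game_settings(0.0))   # (0, 6.5)  even game
print(game_settings(0.25))  # (0, 3.5)  quarter-rank gap absorbed by komi
print(game_settings(2.5))   # (2, 0.5)  two stones plus minimal komi
```

Robert's objection above applies directly to a scheme like this: whenever game-to-game variance exceeds the komi step, the extra granularity buys nothing and only adds settings to double-check.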
witwit wrote: there is no way to objectively measure accuracy like you can when judging the internal accuracy of the system

Do you say that an objective external measure of accuracy cannot exist, or that so far nobody has described such a measure?
jts wrote: Well, not necessarily. If your most recent partners decline, you'll decline too. It just assumes that, in the absence of evidence, you can still beat the same people and lose to the same people.

So in a very real way, it's better to beat someone whose rating is going up than someone whose rating is trending down or staying flat, assuming of course that past performance says something about future results.
RobertJasiek wrote: Do you say that an objective external measure of accuracy cannot exist, or that so far nobody has described such a measure?

I meant that an objective measure of internal consistency does exist, while an objective measure of consistency with external systems can only be defined by arbitrarily picking another system to compare against.
emeraldemon wrote: Even if it's true that winning is more tiring than losing (which I'm not sure of),

I am sure, based on the experience of tens of thousands of games.
You can't say "Player A would beat Player B 80% of the time if Player A didn't have to win 80% of the time".

Since one cannot say so, a rating system must avoid punishing players for becoming tired by making them win too great a percentage for too long a time.
are you trying to suggest that we should model this?

Not model it - avoid it.
witwit wrote: an objective measure of consistency with external systems can only be defined by arbitrarily picking another system to compare against.

Arbitrarily picking another system is not an objective measure of consistency with external systems. Theoretical insight independent of particular external systems could possibly provide an objective measure. It is, however, still unclear which assumptions behind such theoretical insight can be called objective rather than arbitrary axioms. Getting a good answer to this is the real difficulty.
emeraldemon wrote: Thanks for the link. wms, did the results of that study make you consider trying his algorithm?

Yes and no. Yes, in that it made me decide that if I ever revisit the ranking system, Remi's system would be the first place I would go for alternatives. No, in that his paper reaffirmed my belief that the KGS system is "good enough" and that there is no urgent need to replace it.
Another research direction would be to improve the model. An efficient application of WHR to Go data would require some refinements of the dynamic Bradley-Terry model that the KGS rating algorithm [13] already has. In particular, it should be able to:
– Handle handicap and komi.
– Deal with outliers.
– Handle the fact that beginners make faster progress than experts.
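As a rough illustration of the first refinement on that list, here is how a Bradley-Terry win probability could absorb handicap and komi as a rating offset. The Elo-like 400-point scale, the 100-points-per-stone figure, and the komi conversion are assumptions made for this sketch, not values taken from WHR or the KGS algorithm:

```python
POINTS_PER_STONE = 100.0      # assumed rating value of one handicap stone
POINTS_PER_KOMI_POINT = 8.0   # assumed rating value of one point of komi

def p_black_wins(rating_black, rating_white, handicap=0, komi_delta=0.0):
    """P(Black wins) under a Bradley-Terry model on an Elo-like scale.

    Handicap stones shift Black's effective rating up; komi_delta is the
    deviation from standard komi in points (positive = Black gives less komi).
    """
    effective_black = (rating_black
                       + handicap * POINTS_PER_STONE
                       + komi_delta * POINTS_PER_KOMI_POINT)
    gamma_black = 10 ** (effective_black / 400)   # Bradley-Terry strength
    gamma_white = 10 ** (rating_white / 400)
    return gamma_black / (gamma_black + gamma_white)

print(p_black_wins(1500, 1500))              # 0.5: even game, equal ratings
print(p_black_wins(1300, 1500, handicap=2))  # 0.5: two stones cancel a 200-point gap
```

Outlier handling and the faster progress of beginners would then live in the prior over how ratings evolve over time rather than in this per-game likelihood, which is beyond a sketch like this.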
Kaya.gs wrote: My opinion is that accuracy is just one of the factors in a rating system. The psychology of it is very important. I think the key element that produces discontent with KGS's rating system is heaviness. It's an educated guess that the #1 reason for multiple accounts is the rating system.

I think that hits the nail on the head.