
All times are UTC - 8 hours [ DST ]




Post new topic Reply to topic  [ 52 posts ]  Go to page 1, 2, 3  Next
 Post subject: Can We Stop Calling Kata "scoreMean" Points?
Post #1 Posted: Wed Dec 11, 2019 1:24 am 
Lives in gote

Posts: 577
Liked others: 22
Was liked: 36
Rank: Fox Tygem 6d
KGS: emerus
Tygem: emerus
OGS: emerus
It is misleading and wrong. I just had a wild discussion where the consensus was that professionals regularly get into positions with a 5-10 point differential by move 30 "because Kata says so and it is stronger than them". Horrifying.

I would guess that of ten kyu level Katago users, more than half are misled by the current labeling of this feature to one extent or another. It will definitely handicap some people who decide to learn to count on their own in the future.

The most misleading aspect of this value is that it obviously does not mean the same thing early in the game as it does later on. A point (scoring term) in a Go game is worth the same on move 0 and on move 350.

Here is an album showing how KataGo heavily overestimates mistakes. That is just one (the worst kind of) mistake which is quite simply a 15 point mistake that Kata's current algorithm doesn't understand. Now imagine games where players (professionals) are making multiple moves that KataGo doesn't agree with - it quickly reaches a point where Kata believes there is regularly a ten point differential before fifty moves. Any strong player understands this is false. Anyone who understands networks or computer programming might understand this as well (though you may be surprised). Others can't/don't want to understand.


This post by emerus was liked by: Applebaps
 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #2 Posted: Wed Dec 11, 2019 2:05 am 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
Um, so could you show us a realistic middlegame position (not where someone has stupidly passed in the opening) where you think KataGo is out by ten points?

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #3 Posted: Wed Dec 11, 2019 5:05 am 
Dies in gote

Posts: 52
Liked others: 2
Was liked: 12
Similar to 'winrate', the interpretation of KataGo scoreMean/Points should be that this is the expected score difference at the end of the game, based on the experience learned from the program's training games.

The obvious 'exaggeration' is that the side that is behind will take risks, which would often increase the score difference compared with what would happen if that player aimed to lose by the fewest points. This effect is easily seen if you set komi to 0, where a 9 point difference is expected.
Note that we do not know if a similar effect could be seen in human play, so it's not fair to say that it is outright wrong.
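
To make the risk-taking effect concrete, here is a minimal sketch with made-up numbers. The deficit, the gamble's success rate and its payoffs are all hypothetical and are not taken from KataGo; the sketch only shows how a trailing side that gambles can improve its winning chances while also increasing the average final margin.

```python
def mean_final_margin(p_success, margin_if_success, margin_if_failure):
    """Average final margin (leader's score minus trailer's) when the trailer gambles."""
    return p_success * margin_if_success + (1 - p_success) * margin_if_failure

# Hypothetical position: with solid play the trailing side loses by 14.
solid_margin = 14.0

# Hypothetical gamble: 20% of the time it works and the trailer wins by 1,
# 80% of the time it fails and the deficit grows to 22.
risky_margin = mean_final_margin(0.2, -1.0, 22.0)

print(f"solid play: margin {solid_margin}, trailer wins 0% of games")
print(f"risky play: mean margin {risky_margin:.1f}, trailer wins 20% of games")
```

Under these assumed numbers the gamble is the right choice for a player who only cares about winning, yet the average margin grows from 14 to about 17.4.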

It is however reasonable to conclude that if, for instance, 'scoreMean' goes from -20 to 0, it does not necessarily mean that there was a 20-point (local, countable) mistake.

Now if you're comparing to traditional human counting, where most people discard assigning value to 'moyos'/'influence'/'potential'/'weak groups' etc. then of course you will get a big difference! But to reserve the word 'points' for that kind of estimate, I do not think is fair either.
That is the great improvement of this kind of program, that it can give a meaningful number that takes into account every aspect of the board!

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #4 Posted: Wed Dec 11, 2019 5:12 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
emerus wrote:
It is misleading and wrong. I just had a wild discussion where the consensus was that professionals regularly get into positions with a 5-10 point differential by move 30 "because Kata says so and it is stronger than them". Horrifying.


So we are talking about winrate estimates of 60-40, more or less?

Quote:
I would guess that of ten kyu level Katago users, more than half are misled by the current labeling of this feature to one extent or another. It will definitely handicap some people who decide to learn to count on their own in the future.


Counting is a static feature which is not an estimate of the final score. We should not expect the two to agree, until the end of the game.

Quote:
The most misleading aspect of this value is that it obviously does not mean the same thing early in the game as it does later on. A point (scoring term) in a Go game is worth the same on move 0 and on move 350.


The current count of the empty board is plainly 0. When people learn to count, this static evaluation is what they attempt to learn. Now, based upon statistics, we estimate that the final count between human pros will average around 7 points for Black. Even with players that strong, there is a large uncertainty. If resigned games were played out, we would surely see some differences greater than 20 points. I challenge anyone who has no knowledge of game statistics to come up with a reasonable estimate of the final score at move 30. That's not what people do when they learn how to count.

Is there any go program in the world that is designed to estimate the current count of the go board at move 30, or move 50, or move 100? No. Bots, including KataGo, are designed to win games. I repeatedly make this point until I am blue in the face. Winrate estimates are used to help bots choose plays, not to estimate the statistics of winning and losing. To do that, they do not have to be accurate at any stage of the game. Notoriously, they may not even be accurate at the end of the game.

If Black opens on a 4-4 point, we currently estimate the current count to be around 15. We do not reach that estimate by counting, but by our knowledge of game statistics. 100 years ago, pros estimated the current count to be around 10. They appear to have been off by around 50%. Anyway, human counting before the endgame is wildly inaccurate, and KataGo doesn't even try to do that.

Now, I think that learning to estimate the current static count is a valuable thing for humans, and, with the help of computer programs, we may come up with good methods in the future. But right now, AFAICT, nobody is writing programs to help us do that. Bots are not programmed to evaluate positions accurately, but to win games.

Quote:
Here is an album showing how KataGo heavily overestimates mistakes. That is just one (the worst kind of) mistake which is quite simply a 15 point mistake that Kata's current algorithm doesn't understand.


I take it that you came up with the 15 point estimate, not from counting the board, but from game statistics. :)

Quote:
Now imagine games where players (professionals) are making multiple moves that KataGo doesn't agree with - it quickly reaches a point where Kata believes there is regularly a ten point differential before fifty moves. Any strong player understands this is false.


Well, we disagree about that. Any strong player should know that it might well be true. Any human that believes that humans can accurately estimate the final score after 50 moves is sadly mistaken.

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #5 Posted: Wed Dec 11, 2019 5:28 am 
Gosei

Posts: 1754
Liked others: 177
Was liked: 492
A better way of measuring the size of the mistake would be the following: if m is a move number, let S(m) be the komi at which, at move m, KataGo thinks that the game is even (a 50% chance of winning).

Then the size of the mistake at move m is the difference between S(m) and S(m-1).

Unfortunately Katago doesn't compute that number automatically.
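
As a rough sketch of how S(m) could be found by hand, one can bisect on komi until the reported winrate reaches 50%. The `toy_winrate` function below is a hypothetical stand-in for an engine query (a logistic curve I made up), not a KataGo call, and the leads of 7 and -7 are illustrative values only. The sign convention (S before the move minus S after it, from Black's point of view) is also my assumption.

```python
from math import exp

def fair_komi(winrate, lo=-50.0, hi=50.0, tol=0.01):
    """Bisect for the komi at which Black's winrate is 50%.

    `winrate(komi)` must decrease as komi grows (komi is given to White).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if winrate(mid) > 0.5:
            lo = mid   # Black still favoured: White needs more komi
        else:
            hi = mid   # White favoured or even: komi is too large
    return (lo + hi) / 2

def toy_winrate(lead):
    """Hypothetical engine stand-in: logistic in (Black's lead - komi)."""
    return lambda komi: 1 / (1 + exp(-(lead - komi) / 4))

s_before = fair_komi(toy_winrate(7.0))    # position before the move
s_after = fair_komi(toy_winrate(-7.0))    # position after the move
mistake = s_before - s_after              # points lost by Black's move

print(f"S before: {s_before:.1f}, S after: {s_after:.1f}, mistake: {mistake:.1f}")
```

With a real engine, each call to `winrate` would mean re-analysing the position at a different komi, which is why doing this for every move is slow.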

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #6 Posted: Wed Dec 11, 2019 5:45 am 
Lives in gote

Posts: 577
Liked others: 22
Was liked: 36
Rank: Fox Tygem 6d
KGS: emerus
Tygem: emerus
OGS: emerus
Bill Spight wrote:

Quote:
Here is an album showing how KataGo heavily overestimates mistakes. That is just one (the worst kind of) mistake which is quite simply a 15 point mistake that Kata's current algorithm doesn't understand.


I take it that you came up with the 15 point estimate, not from counting the board, but from game statistics. :)



Have to get some sleep before responding but I gather from replies that several of you miss the point of the album. If you pass in a mirror position, the "value" of that mistake is komi x2. KataGo is calling it 19 or 20. Overestimating the value of early game mistakes is a regular habit of it.

The point is that we, as stronger players discussing these positions, should call them something besides points. It isn't a good thing at all. Points already exist and have a definition. At least, if you want to be borderline misleading - call them KataGo Points, but I think we can do better.

xela wrote:
Um, so could you show us a realistic middlegame position (not where someone has stupidly passed in the opening) where you think KataGo is out by ten points?


This is the most useless thing I read today. You want me to find proof for your strawman? :)

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #7 Posted: Wed Dec 11, 2019 6:58 am 
Dies in gote

Posts: 52
Liked others: 2
Was liked: 12
Well, if you only object to the program 'exaggerating' the difference of this '20 point mistake early in the game', then the misconception is not about whether or not KataGo estimates points. It is that KataGo is not estimating the 'expected points difference under perfect point optimizing play', but the expected points difference at the end of the game, based on the experience of its self-play games. That estimate takes into account that the losing side might play riskily in order to try to win, which is the main goal of the program, as Bill said.

And I agree that is a valid concern, which is important to educate users of the program about.

Edit: You say that the term 'points' has a clear definition. Perhaps you would like to share that definition? If it's something like what I just mentioned, then your concern makes good sense ^^

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #8 Posted: Wed Dec 11, 2019 8:21 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
If instead of passing we make some perturbation to an early game position that is simply an unconditional endgame loss of some small number of countable points with minimal aji implications (it may be hard to remove ko threat implications), how well does KataGo's scoreMean correspond to that?

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #9 Posted: Wed Dec 11, 2019 8:35 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
emerus wrote:
Bill Spight wrote:

Quote:
Here is an album showing how KataGo heavily overestimates mistakes. That is just one (the worst kind of) mistake which is quite simply a 15 point mistake that Kata's current algorithm doesn't understand.


I take it that you came up with the 15 point estimate, not from counting the board, but from game statistics. :)



Have to get some sleep before responding but I gather from replies that several of you miss the point of the album. If you pass in a mirror position, the "value" of that mistake is komi x2.


No, that is exactly what I meant. Komi is determined statistically.


Quote:
KataGo is calling it 19 or 20. Overestimating the value of early game mistakes is a regular habit of it.


Apparently so. And, if Yakago is correct, that is based upon estimating the score of KataGo's self-play games, for which it may be correct. The reason being that KataGo, believing itself to be behind in those games, takes risks that cost points, on average. Human players may not take such risks as early in the game as KataGo. Or perhaps, if we had scores for human resignations, we would come up with the same estimate as KataGo. Komi is based upon the median result, not the average. IIUC, KataGo's estimate is an average.
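
A small sketch of the median versus average point, using invented final margins (these five numbers are hypothetical, not real game data). One lopsided game pulls the average well above the median, while the komi that splits wins evenly sits near the median.

```python
from statistics import mean, median

# Hypothetical final margins (Black's score minus White's, before komi).
margins = [5, 6, 7, 8, 30]

avg = mean(margins)      # pulled upward by the single 30-point blowout
mid = median(margins)    # the middle result

def share_black_wins(komi):
    """Fraction of these games Black would win at the given komi."""
    return sum(m > komi for m in margins) / len(margins)

print(f"mean = {avg}, median = {mid}")
print(f"Black's share at komi 6.5: {share_black_wins(6.5)}")
print(f"Black's share at komi 7.5: {share_black_wins(7.5)}")
print(f"Black's share at komi {avg}: {share_black_wins(avg)}")
```

In this toy sample a komi equal to the average would leave Black winning only one game in five, which is why an average final score and a fair komi need not agree.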

Quote:
The point is that we, as stronger players discussing these positions, should call them something besides points.


OK. Call them average final score estimates, which is what they are.

Quote:
Points already exist and have a definition.


Points do not already exist, in the sense that scores do. They are not scores, nor are they estimates of scores. They have a definition, but are intractable to calculate before the endgame.



This post by Bill Spight was liked by: emerus
 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #10 Posted: Wed Dec 11, 2019 8:38 am 
Gosei

Posts: 1754
Liked others: 177
Was liked: 492
P.S. To illustrate my previous post:

at move 6, with komi 7, the game is almost even.

Attachment: Capture.PNG


Then Black passes, and komi is reset at -7. The game is again almost even.

Attachment: Capture2.PNG


So passing in the early stages of the game (move 7) is a mistake of about 14 points.


This post by jlt was liked by 2 people: Bill Spight, emerus
 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #11 Posted: Wed Dec 11, 2019 1:25 pm 
Lives in gote

Posts: 577
Liked others: 22
Was liked: 36
Rank: Fox Tygem 6d
KGS: emerus
Tygem: emerus
OGS: emerus
Yakago wrote:
Edit: You say that the term 'points' has a clear definition. Perhaps you would like to share that definition? If it's something like what I just mentioned, then your concern makes good sense ^^


Zi or moku. White (usually) starts out with some of them and they are clearly defined. Examples: a prisoner is a moku, a live stone on the board is a zi.

Bill Spight wrote:
Is there any go program in the world that is designed to estimate the current count of the go board at move 30, or move 50, or move 100? No. Bots, including KataGo, are designed to win games. I repeatedly make this point until I am blue in the face. Winrate estimates are used to help bots choose plays, not to estimate the statistics of winning and losing. To do that, they do not have to be accurate at any stage of the game. Notoriously, they may not even be accurate at the end of the game.


I repeat this too. It is why scoreMean (especially the way the common KataGo users understand it) is misleading. It isn't even used by the program to win games. If we want some arbitrary number, we already have %'s. Of course, the idea to translate from something so arbitrary as %'s to something more meaningful to the user is commendable and I think that it should be pursued.

Bill Spight wrote:
OK. Call them average final score estimates, which is what they are.


Yes please. My plea doesn't target you obviously. The problem that I see is that KataGo is very much the popular choice for the weaker players and it is because they are told that "X mistake is a 2.5 point mistake" or they think they are being told the current score differential.

Bill Spight wrote:
Points do not already exist, in the sense that scores do. They are not scores, nor are they estimates of scores. They have a definition, but are intractable to calculate before the endgame.

They can be calculated simply before the endgame (speaking of move values). If you remove an opponent's stone from the board in Chinese rules, you deny them a point and if you capture a prisoner in Japanese rules, you gain a point. It is clear that they are not as intractable as you make it sound. Obviously, most of the posters in this thread do understand that KataGo doesn't know what a point is. I don't think that is true of other users, even some on L19x19.

Point differentials are as intractable as you say and humans are god-awful at estimating them... but AIs do not estimate them accurately either. It is a failure that we let so many AI users believe that this is not the case.

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #12 Posted: Wed Dec 11, 2019 2:23 pm 
Gosei

Posts: 1733
Location: Earth
Liked others: 621
Was liked: 310
I think this is an example of a straw man fallacy :twisted:

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #13 Posted: Wed Dec 11, 2019 2:46 pm 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
emerus wrote:
xela wrote:
Um, so could you show us a realistic middlegame position (not where someone has stupidly passed in the opening) where you think KataGo is out by ten points?

You want me to find proof for your strawman?

No, I am honestly trying to understand what you're talking about --
emerus wrote:
Now imagine games where players (professionals) are making multiple moves that KataGo doesn't agree with - it quickly reaches a point where Kata believes there is regularly a ten point differential before fifty moves.

I really think an example would help.

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #14 Posted: Wed Dec 11, 2019 2:52 pm 
Lives in gote

Posts: 577
Liked others: 22
Was liked: 36
Rank: Fox Tygem 6d
KGS: emerus
Tygem: emerus
OGS: emerus
Gomoto wrote:
I think this is an example of a straw man fallacy :twisted:


Gomoto wrote:
What is the fuss all about?

KataGo has a strong opinion about this opening. With the marked move white takes the lead with about 3 points early in the opening.


 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #15 Posted: Wed Dec 11, 2019 3:11 pm 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
How about I just make the next training run of KataGo include a prediction target that consists of "what number of points would be needed to make the estimated winning chance close to 50-50" rather than "what is the average difference in final points that will result from self-play" and use this prediction as the value to report to users instead?

With some thought, I think I have settled on a training method that I think should be effective for this.


This post by lightvector was liked by 4 people: Bill Spight, emerus, marvin, Waylon
 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #16 Posted: Wed Dec 11, 2019 3:16 pm 
Lives in gote

Posts: 577
Liked others: 22
Was liked: 36
Rank: Fox Tygem 6d
KGS: emerus
Tygem: emerus
OGS: emerus
xela wrote:
I really think an example would help.


3rd game I opened: Game here


Not +/-10, but I am not going to look very hard for something that I've seen in at least 1/10 of the games I open in KataGo. If you are a user of KataGo and haven't noticed this by now, then you should look for it.

How often do you think professionals in post-AI age actually have such a large (>5 scoreMean) deficit by move 41? KataGo thinks it is like 10% of the time. It is ludicrous to me.

edit:
lightvector wrote:
How about I just make the next training run of KataGo include a prediction target that consists of "what number of points would be needed to make the estimated winning chance close to 50-50" rather than "what is the average difference in final points that will result from self-play" and use this prediction as the value to report to users instead?

With some thought, I think I have settled on a training method that I think should be effective for this.


I chose this forum for my plea/rant because I know you are active (I also thought about your GitHub). I do think a simple clarity fix would go a long way, though maybe the cat is already out of the bag. Any improvement (especially this one) is also great. ^^


Attachment: JDOtnBU.png
 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #17 Posted: Wed Dec 11, 2019 3:32 pm 
Gosei

Posts: 1733
Location: Earth
Liked others: 621
Was liked: 310
emerus, the straw man fallacy is that you imply people who are talking about Katago score as "points" are dumb because they are using a certain analogy that does not exist according to you, when in fact talking about points is just a convenient way to compare the values of plays.

I am by the way not offended in any way by your argument, I just think it is wrong.

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #18 Posted: Wed Dec 11, 2019 3:35 pm 
Gosei

Posts: 1733
Location: Earth
Liked others: 621
Was liked: 310
lightvector, I think the Katago score is a fine tool as it is.

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #19 Posted: Wed Dec 11, 2019 3:53 pm 
Gosei

Posts: 1733
Location: Earth
Liked others: 621
Was liked: 310
The Katago score is a really good indication of how many points score difference there will be at the end of the game if both players make optimal plays (never the case) or similar size mistakes (often the case if both players have a similar strength).

emerus, if you think this is a misleading way to analyze go games with Katago and should be avoided or improved, then I am guilty.

 Post subject: Re: Can We Stop Calling Kata "scoreMean" Points?
Post #20 Posted: Wed Dec 11, 2019 4:19 pm 
Lives in gote

Posts: 577
Liked others: 22
Was liked: 36
Rank: Fox Tygem 6d
KGS: emerus
Tygem: emerus
OGS: emerus
Gomoto wrote:
emerus, if you think this is a misleading way to analyze go games with Katago and should be avoided or improved, then I am guilty.


It isn't a misleading way to study. It's a useful, good tool.

Comments like "White is ahead by 3 points on move 6" are misleading. I am not sure how often you (or other forum regulars) interact with 2k-10k players who use KataGo. They usually believe these are points or that KataGo is at least trying to determine points (in the cases where they are aware it isn't an easy/possible task). This is what is misleading.

I honestly do not know where calling them points began; I assume it is because nothing intuitive or catchy was proposed. "Endgame scoreMean estimations based on training data" is a mouthful and doesn't have quite the ring to it. ;-)

edit:
Gomoto wrote:
The Katago score is a really good indication of how many points score difference there will be at the end of the game if both players make optimal plays (never the case) or similar size mistakes (often the case if both players have a similar strength).


Had to break this quote after re-reading it. How can you say that KataGo score is a "really good indication" ...? Do you know what optimal plays are? KataGo doesn't. The fact that the scoreMean is from training data and not match data also clearly says that it isn't even trying to use optimal plays to gather this value.

Eh, last point for emphasis. It is a fine tool. It surely beats AI %'s. We can strive for better and at the very least make it clearer what exactly the value that the tool gives you means.
