Score mean versus probability

Gomoto · Post by **Gomoto** » Sun Nov 17, 2019 7:14 am

I think the points difference shown by KataGo is a better learning tool for me than the win probability. I can evaluate and compare moves in every phase of the game without having to adjust for the vastly different probability swings of, for example, a 2-point mistake in the opening versus one in the endgame.
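To make the phase dependence concrete, here is a rough sketch in Python. It assumes the final margin is normally distributed around the current score lead, with a spread that shrinks as the game goes on. Both the model and the two sigma values are illustrative assumptions, not anything KataGo reports.

```python
from statistics import NormalDist

def winrate(lead: float, sigma: float) -> float:
    """P(final margin > 0) if the margin ~ Normal(lead, sigma). Toy model only."""
    return NormalDist().cdf(lead / sigma)

def winrate_swing(lead: float, loss: float, sigma: float) -> float:
    """Winrate drop caused by a mistake that costs `loss` points."""
    return winrate(lead, sigma) - winrate(lead - loss, sigma)

# Hypothetical spreads: wide in the opening, narrow in the endgame.
OPENING_SIGMA = 20.0
ENDGAME_SIGMA = 1.5

if __name__ == "__main__":
    for phase, sigma in [("opening", OPENING_SIGMA), ("endgame", ENDGAME_SIGMA)]:
        swing = winrate_swing(lead=1.0, loss=2.0, sigma=sigma)
        print(f"{phase}: a 2-point mistake costs {swing:.1%} winrate")
```

Under these assumptions the same 2-point loss moves the winrate by only a few percentage points in the opening but by a large fraction in a close endgame, while the score difference is 2 points in both cases.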

Nowadays I turn off probability and only look at the points when analyzing with KataGo. Only when I compare interesting variations with ELF and LZ do I switch probabilities on again.

Thank you very much lightvector, I enjoy analyzing go even more with KataGo. It is such a nice program.

("even more" and "such a" are reminiscent of two of my great human teachers, by the way. Perhaps some of you can guess them correctly.)

xela · Post by **xela** » Sun Nov 17, 2019 2:58 pm

I agree with most of this. Especially the thanks to lightvector and all the other open source programmers who share so freely :-)

Just a word of caution: it's possible to have a positive score together with a bad winrate. Example: if I can kill my opponent's big group, I win by 50 points; if I can't kill, I lose by 2 points. Probably I can't kill, but the small possibility makes my average score look good even though I'm losing the game.

If you look exclusively at scores, you might learn some risky behaviours. Perhaps best to have both numbers in front of you, look mostly at the scores, but keep an eye open for times when the winrate tells a different story?
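The kill-or-lose example can be checked with a few lines of Python. The 10% chance of killing is an assumed number, chosen only to stand in for "probably I can't kill"; the 50-point and 2-point margins are the ones from the example.

```python
def mean_score(outcomes):
    """Expected final margin over (score, probability) pairs."""
    return sum(prob * score for score, prob in outcomes)

def winrate(outcomes):
    """Total probability of the outcomes with a positive margin."""
    return sum(prob for score, prob in outcomes if score > 0)

P_KILL = 0.1  # assumed chance that the kill succeeds

# Win by 50 if the group dies, lose by 2 otherwise.
outcomes = [(50.0, P_KILL), (-2.0, 1.0 - P_KILL)]

if __name__ == "__main__":
    print(f"mean score: {mean_score(outcomes):+.1f}")
    print(f"winrate:    {winrate(outcomes):.0%}")
```

With these numbers the mean score comes out positive even though the game is lost nine times out of ten.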

Bill Spight · Post by **Bill Spight** » Sun Nov 17, 2019 3:14 pm

xela wrote:I agree with most of this. Especially the thanks to lightvector and all the other open source programmers who share so freely

Just a word of caution: it's possible to have a positive score together with a bad winrate. Example: if I can kill my opponent's big group, I win by 50 points; if I can't kill, I lose by 2 points. Probably I can't kill, but the small possibility makes my average score look good even though I'm losing the game.

This is why I prefer the median score to the mean. Statistical komi is a median, for instance.

xela wrote:If you look exclusively at scores, you might learn some risky behaviours. Perhaps best to have both numbers in front of you, look mostly at the scores, but keep an eye open for times when the winrate tells a different story?

If you look at scores you have to consider the temperature, as well. For instance, if you are 2 pts. behind but the temperature is 6 (Edit: and you have the move), you have a good chance of winning, as a rule. Winrates, whatever their flaws, are predictive in themselves. You do need to know the number of visits or playouts as a confidence measure, but the bots can tell you that. I agree that combining estimated scores with estimated winrates is a good idea.
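A small Python sketch of the median point, using a hypothetical two-outcome distribution in the spirit of the kill-or-lose example (the probabilities are assumptions). The median is taken as the smallest score at which the cumulative probability reaches one half, so its sign follows the likelier result rather than the rare large swing.

```python
def median_score(outcomes):
    """Smallest score whose cumulative probability reaches 0.5.

    `outcomes` is a list of (score, probability) pairs summing to 1.
    """
    total = 0.0
    for score, prob in sorted(outcomes):
        total += prob
        if total >= 0.5:
            return score
    raise ValueError("probabilities must sum to at least 0.5")

# Assumed distribution: a 10% chance of winning by 50, otherwise lose by 2.
unlikely_kill = [(50.0, 0.1), (-2.0, 0.9)]

if __name__ == "__main__":
    print("median score:", median_score(unlikely_kill))
```

Here the median is the 2-point loss, which agrees with the winrate, while the mean of the same distribution is positive.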

lightvector · Post by **lightvector** » Sun Nov 17, 2019 4:11 pm

I second what xela said. Do take some care when you find yourself in a situation that involves a critical large dragon, or a difficult semeai, or a massive and yet unclear ko, or another situation that may make the game very very swingy in a way that KataGo has some uncertainty about.

But otherwise, glad to hear it.

mhlepore · Post by **mhlepore** » Sun Nov 17, 2019 5:08 pm

Forgive the question if it is a non-issue as I haven’t kept up with the details of this stuff…

I recall reading a while back that bots with a lead will sometimes play sub-optimally in the endgame to ensure their win. That is, trade down the expected margin of victory for an increase in probability of victory.

What assumptions does KataGo make about expected margin of victory with respect to this issue (if it is actually a thing)? Is laying off the gas a bit baked into the score estimation? If so, is it just a point or so?

Bill Spight · Post by **Bill Spight** » Sun Nov 17, 2019 9:21 pm

mhlepore wrote:I recall reading a while back that bots with a lead will sometimes play sub-optimally in the endgame to ensure their win. That is, trade down the expected margin of victory for an increase in probability of victory.

Well, the impression that many people, myself included, have is that top bots, going back to the MCTS bots before AlphaGo, typically win games by smaller margins than an amateur dan typically would, and maybe even weaker humans. The claim has been made in the bots' defense that they give up points in order to secure the win. To my mind, that claim has never been proven. OTOH, I am unaware of anybody coming up with a case where a top bot would have lost a game versus human play because of giving up a few points in the endgame.

There was a case a while back where a top bot lost a point at the end of play by unnecessarily filling in a point of territory, thus losing the game by ½ pt. But that was by territory scoring with a 6½ pt. komi, which is not the game the bot was playing.

The main problem with the defense of the bots, it seems to me, is what is meant by a winrate. IIUC, a winrate estimate assumes that the bot is playing against itself. That weakens the defense argument, because a bot could well have a blind spot that it would share with itself as the opponent, but which a different opponent would exploit. The argument then becomes that the bots make objectively suboptimal endgame plays that increase their estimate of the odds of winning the game against a player that makes the same mistakes that it does. Hardly compelling.

We already know that strong amateurs are still better than the bots in certain situations such as those with long ladders and large semeai. Humans are good at depth-first search in local situations, local being a fuzzy concept. By contrast, today's top bots do a kind of best-first search over the whole board. That can put them at a disadvantage versus humans. Because a game of go tends to divide into a number of local situations in the endgame, human play can approach perfection, because depth-first local search pays off. There is still the question of which local region to play in, but humans have good heuristics and algorithms for that. Anyway, I doubt that any of today's top bots could solve every problem in Berlekamp and Wolfe's Mathematical Go, if they were amended for a 7½ pt. komi.

I have not been motivated to look for endgame mistakes by top bots because, well, who cares? And I am not at all sure that top bots of 2018 and 2019 make game-losing endgame errors (Edit: simply by making small plays). For instance, I ran across an example in the Elf commentaries where Elf recommended filling a ⅓ pt. ko instead of nailing down the win by filling a larger ko, so that it would not matter whether it won the ⅓ pt. ko or not. What human would play that way? Well, as it turns out, Elf would have won the larger ko, as well, so no harm done.

dfan · Post by **dfan** » Mon Nov 18, 2019 6:08 am

mhlepore wrote:Forgive the question if it is a non-issue as I haven’t kept up with the details of this stuff…

I recall reading a while back that bots with a lead will sometimes play sub-optimally in the endgame to ensure their win. That is, trade down the expected margin of victory for an increase in probability of victory.

What assumptions does KataGo make about expected margin of victory with respect to this issue (if it is actually a thing)? Is laying off the gas a bit baked into the score estimation? If so, is it just a point or so?

KataGo cares about the margin of victory and tries to win by more (or lose by less) if possible, so it should make fewer slack endgame moves in the first place. It does this by using a reward function that dispenses a higher reward for bigger wins. You can see the function in Appendix F of the paper. (Edit: I should add that the graph in the appendix looks like "the function" but in fact it is just one dynamic component of it.)

Of course you have to be careful with bonuses like this because if the bonus is too small, it can get lost in the noise, and if the bonus is too big, it can induce the bot to take unnecessary risks (e.g., getting excited about possibly winning by 1.5 instead of 0.5, even though that opens up the possibility of losing by 0.5).
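Here is a toy Python sketch of the "bonus too big" side of that trade-off. The utility is a win/loss term plus a saturating score bonus. The arctan shape, the weights, and the 90% success chance of the risky move are all assumptions for illustration and are not KataGo's actual function or constants (see the paper for those).

```python
import math

def utility(score, bonus_weight, scale=1.0):
    """Win/loss term (+1 or -1) plus a bounded bonus for the margin."""
    win_loss = 1.0 if score > 0 else -1.0
    bonus = bonus_weight * (2.0 / math.pi) * math.atan(score / scale)
    return win_loss + bonus

def expected_utility(outcomes, bonus_weight):
    """Average utility over (score, probability) pairs."""
    return sum(prob * utility(score, bonus_weight) for score, prob in outcomes)

# Hypothetical choice: a sure 0.5-point win, or a try for 1.5 points
# that fails 10% of the time and then loses by 0.5.
MOVES = {
    "safe": [(0.5, 1.0)],
    "risky": [(1.5, 0.9), (-0.5, 0.1)],
}

def best_move(bonus_weight):
    """Name of the move with the highest expected utility."""
    return max(MOVES, key=lambda name: expected_utility(MOVES[name], bonus_weight))

if __name__ == "__main__":
    for weight in (0.0, 0.1, 2.0):
        print(f"bonus weight {weight}: prefers {best_move(weight)}")
```

With no bonus or a small one, the sure win is preferred. Once the bonus weight is large enough, the extra point outweighs the 10% chance of losing and the sketch switches to the risky move.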

Bill Spight · Post by **Bill Spight** » Mon Nov 18, 2019 10:06 am

dfan wrote:KataGo cares about the margin of victory and tries to win by more (or lose by less) if possible, so it should make fewer slack endgame moves in the first place. It does this by using a reward function that dispenses a higher reward for bigger wins. You can see the function in Appendix F of the paper. (Edit: I should add that the graph in the appendix looks like "the function" but in fact it is just one dynamic component of it.)

Appendix F? I got 38 pp. up to Appendix D.

dfan · Post by **dfan** » Mon Nov 18, 2019 10:11 am

Bill Spight wrote:Appendix F? I got 38 pp. up to Appendix D.

I bet you are looking at an earlier version of the paper. The link I gave should have gone to the latest version by default, but here it is explicitly: https://arxiv.org/abs/1902.10565v3

mhlepore · Post by **mhlepore** » Mon Nov 18, 2019 2:45 pm

Thanks dfan and Bill.

I was mainly wondering if an estimated 2-point win was really a 4-point difference with a few points of slack built in. That appendix seems to put my concern to rest.

Life In 19x19
