Was the ear-reddening move a divine move?
- pnprog
- Lives with ko
- Posts: 286
- Joined: Thu Oct 20, 2016 7:21 am
- Rank: OGS 7 kyu
- GD Posts: 0
- Has thanked: 94 times
- Been thanked: 153 times
Re: Was the ear-reddening move a divine move?
But aren't all those old games without komi? If so, asking those bots, trained for 7.5-point komi, to evaluate board positions or moves does not seem like a good idea to me.
I am the author of GoReviewPartner, a small software aimed at assisting reviewing a game of Go. Give it a try!
-
Leon
- Dies in gote
- Posts: 37
- Joined: Thu Sep 06, 2018 1:31 pm
- Rank: EGF 2d
- GD Posts: 0
- Has thanked: 8 times
- Been thanked: 11 times
Re: Was the ear-reddening move a divine move?
pnprog wrote: But aren't all those old games without komi? If so, asking those bots, trained for 7.5-point komi, to evaluate board positions or moves does not seem like a good idea to me.
There is a LZ version that supposedly works around this and works for 0 komi or high handicap games: https://old.reddit.com/r/cbaduk/comment ... lz_became/
Uberdude used this for his analysis of the move.
-
John Fairbairn
- Oza
- Posts: 3724
- Joined: Wed Apr 21, 2010 3:09 am
- Has thanked: 20 times
- Been thanked: 4672 times
Re: Was the ear-reddening move a divine move?
Could someone usefully comment on what I regard as slightly unusual behaviour by Lizzie/LZ (I am using rounded figures for simplicity):
1. White starts with a projected win ratio of 54%.
2. If Black passes for Move 1, the White win ratio goes up to 77%.
3. If Black plays on the 19-19 point on Move 1, the White win ratio goes up to 84%. (And Black moves on the edge give around 76%.)
That seems to say, passing is worth more to Black than playing certain moves on the board. Counter-intuitive?
-
dfan
- Gosei
- Posts: 1598
- Joined: Wed Apr 21, 2010 8:49 am
- Rank: AGA 2k Fox 3d
- GD Posts: 61
- KGS: dfan
- Has thanked: 891 times
- Been thanked: 534 times
- Contact:
Re: Was the ear-reddening move a divine move?
John Fairbairn wrote: Could someone usefully comment on what I regard as slightly unusual behaviour by Lizzie/LZ (I am using rounded figures for simplicity):
1. White starts with a projected win ratio of 54%.
2. If Black passes for Move 1, the White win ratio goes up to 77%.
3. If Black plays on the 19-19 point on Move 1, the White win ratio goes up to 84%. (And Black moves on the edge give around 76%.)
That seems to say, passing is worth more to Black than playing certain moves on the board. Counter-intuitive?
My interpretation is that a stone played on the corner point is more likely to be dead by the end of the game than not. (I guess that even if Black ends up trying to take that corner, it is likely to hurt eye-making.)
It is also worth noting that the more unusual a position, the less LZ's intuition is to be trusted, since it has been trained on fewer similar positions. So that 84% is probably even less precise than usual.
-
Tryss
- Lives in gote
- Posts: 502
- Joined: Tue May 24, 2011 1:07 pm
- Rank: KGS 2k
- GD Posts: 100
- KGS: Tryss
- Has thanked: 1 time
- Been thanked: 153 times
Re: Was the ear-reddening move a divine move?
And there's the possibility that the network is sensitive to some very specific small perturbations (this added stone is the perturbation).
This article (about neural networks in image classification) shows something very interesting:
https://arxiv.org/pdf/1710.08864.pdf
Abstract
Recent research has revealed that the output of Deep Neural Networks (DNN) can be easily altered by adding relatively small perturbations to the input vector. In this paper, we analyze an attack in an extremely limited scenario where only one pixel can be modified. For that we propose a novel method for generating one-pixel adversarial perturbations based on differential evolution. It requires less adversarial information and can fool more types of networks. The results show that 68.36% of the natural images in CIFAR-10 test dataset and 41.22% of the ImageNet (ILSVRC 2012) validation images can be perturbed to at least one target class by modifying just one pixel with 73.22% and 5.52% confidence on average. Thus, the proposed attack explores a different take on adversarial machine learning in an extreme limited scenario, showing that current DNNs are also vulnerable to such low dimension attacks.
I think our neural nets are less sensitive to this, but this kind of thing can happen.
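The one-pixel search can be mimicked for a board evaluator. The `toy_eval` function below is a made-up stand-in for a value network (just a fixed random linear probe through a sigmoid), nothing like LZ's actual net; the point is only the procedure of brute-forcing the single added stone that moves the output most:

```python
import math
import random

random.seed(0)
SIZE = 19

# Toy stand-in for a value net: fixed random per-point weights. This is an
# assumption for illustration only, not a real evaluation function.
weights = [[random.gauss(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]

def toy_eval(board):
    """Map a board (+1 Black, -1 White, 0 empty) to a pseudo-winrate in (0, 1)."""
    s = sum(board[r][c] * weights[r][c] for r in range(SIZE) for c in range(SIZE))
    return 1 / (1 + math.exp(-s / SIZE))

board = [[0] * SIZE for _ in range(SIZE)]
base = toy_eval(board)  # empty board evaluates to exactly 0.5 here

# Brute-force the single added Black stone that shifts the output most --
# the board-game analogue of the paper's one-pixel perturbation search.
best_delta, best_point = 0.0, None
for r in range(SIZE):
    for c in range(SIZE):
        board[r][c] = 1
        delta = abs(toy_eval(board) - base)
        board[r][c] = 0
        if delta > best_delta:
            best_delta, best_point = delta, (r, c)

print(f"baseline {base:.3f}, largest single-stone shift {best_delta:.3f} at {best_point}")
```

A linear probe is deliberately insensitive; the worry with deep nets is precisely that some single-point changes produce much larger, harder-to-predict swings.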
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Was the ear-reddening move a divine move?
Tryss wrote: And there's the possibility that the network is sensitive to some very specific small perturbations (this added stone is the perturbation).
This article (about neural networks in image classification) shows something very interesting:
https://arxiv.org/pdf/1710.08864.pdf
Abstract
Recent research has revealed that the output of Deep Neural Networks (DNN) can be easily altered by adding relatively small perturbations to the input vector. In this paper, we analyze an attack in an extremely limited scenario where only one pixel can be modified. For that we propose a novel method for generating one-pixel adversarial perturbations based on differential evolution. It requires less adversarial information and can fool more types of networks. The results show that 68.36% of the natural images in CIFAR-10 test dataset and 41.22% of the ImageNet (ILSVRC 2012) validation images can be perturbed to at least one target class by modifying just one pixel with 73.22% and 5.52% confidence on average. Thus, the proposed attack explores a different take on adversarial machine learning in an extreme limited scenario, showing that current DNNs are also vulnerable to such low dimension attacks.
I think our neural nets are less sensitive to this, but this kind of thing can happen.
Back in the 90s I came up with the idea of testing evaluation functions by comparing two positions which differ at only one point on the board, or by moving a stone to a neighboring empty point (based upon the concept in psychology of just noticeable differences). In go the evaluation function (or neural network) should be very sensitive to some such perturbations, insensitive to others.
Edit: OC, the paper concerns perturbations that are unnoticeable to humans.
Last edited by Bill Spight on Sun Oct 21, 2018 8:36 am, edited 3 times in total.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Was the ear-reddening move a divine move?
John Fairbairn wrote: That seems to say, passing is worth more to Black than playing certain moves on the board. Counter-intuitive?
Not to me. Some moves are bad, some moves are very bad.
And what's worse on the empty board than the 1-1?
John Fairbairn wrote: Could someone usefully comment on what I regard as slightly unusual behaviour by Lizzie/LZ (I am using rounded figures for simplicity):
Given Lizzie's margin of error, rounding to the nearest integer is correct.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Kirby
- Honinbo
- Posts: 9553
- Joined: Wed Feb 24, 2010 6:04 pm
- GD Posts: 0
- KGS: Kirby
- Tygem: 커비라고해
- Has thanked: 1583 times
- Been thanked: 1707 times
Re: Was the ear-reddening move a divine move?
Maybe this was answered in a different thread, but practically speaking, what does the difference in win percentage mean, concretely?
That is to say, if Black is winning with a 65% win rate, I interpret that as a board where Black can win. If the rate is 75%, how does it differ in a practical sense? Black is still going to win, right?
be immersed
-
dfan
- Gosei
- Posts: 1598
- Joined: Wed Apr 21, 2010 8:49 am
- Rank: AGA 2k Fox 3d
- GD Posts: 61
- KGS: dfan
- Has thanked: 891 times
- Been thanked: 534 times
- Contact:
Re: Was the ear-reddening move a divine move?
Kirby wrote: Maybe this was answered in a different thread, but practically speaking, what does the difference in win percentage mean, concretely?
That is to say, if Black is winning with a 65% win rate, I interpret that as a board where Black can win. If the rate is 75%, how does it differ in a practical sense? Black is still going to win, right?
The big relevant discussion is here, but the one-sentence version is that basically if Leela Zero produces a win rate of 65% in some position, it means that Leela Zero thinks that Black would win 65% of the time if it played itself starting from that position. (If you don't like probabilities for bot play, you can turn this into statements about the money odds it would want in order to place a bet on Black or White.)
Of course you can argue about the precise semantics of pretty much every word in that sentence, but that's the intent.
Regarding your second question, it is true that to God every* position should have an evaluation of either 0% or 100%, but bots are not as strong as that.
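dfan's money-odds framing can be made concrete with a tiny sketch. The conversion below is just the standard fair-odds formula, not anything LZ itself computes:

```python
def fair_odds(p: float) -> float:
    """Fair money odds implied by a win probability p (odds = p / (1 - p))."""
    return p / (1 - p)

# A 65% winrate for Black corresponds to fair odds of about 1.86 : 1 --
# a bettor who believed the number should risk up to 1.86 units on Black
# to win 1 unit back.
print(f"{fair_odds(0.65):.2f} : 1")  # prints "1.86 : 1"
```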
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Was the ear-reddening move a divine move?
Kirby wrote: Maybe this was answered in a different thread, but practically speaking, what does the difference in win percentage mean, concretely?
That is to say, if Black is winning with a 65% win rate, I interpret that as a board where Black can win. If the rate is 75%, how does it differ in a practical sense? Black is still going to win, right?
Nope.
As dfan points out, the winrate is an estimate of how often the bot would win playing against itself many times from the current position, or after a certain play. There are unstated assumptions about the constraints on the play, such as time limits.
Let us say that the estimate of a 75% Black winrate is correct. Then, as self-play games continue from the current position or with the given play, 75% of the time the winrate estimate should approach 100% (with fluctuations, OC) until Black finally wins, and 25% of the time it should approach 0% until Black loses. Given that humans are weaker than bots, we should estimate winrates for humans that are closer to 50%, unless we know better.
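Bill's reading can be checked with a toy simulation: assume the 75% estimate is exactly right and treat each self-play continuation as an independent coin flip (a deliberate simplification; real self-play games are not independent coin flips, but the long-run frequency is the claim being illustrated):

```python
import random

random.seed(0)

TRUE_WINRATE = 0.75   # assume the 75% Black winrate estimate is exactly right
N_GAMES = 10_000      # self-play games continued from this position

# Each continuation ends in a Black win with probability TRUE_WINRATE;
# the observed frequency should land very close to the estimate.
wins = sum(random.random() < TRUE_WINRATE for _ in range(N_GAMES))
print(f"Black won {wins / N_GAMES:.1%} of the continuations")  # close to 75%
```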
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
- yakcyll
- Dies with sente
- Posts: 77
- Joined: Thu Apr 19, 2018 6:40 am
- Rank: EGF 3k
- GD Posts: 0
- Universal go server handle: yakcyll
- Location: Warsaw, PL
- Has thanked: 165 times
- Been thanked: 18 times
- Contact:
Re: Was the ear-reddening move a divine move?
I think the key question is how specific playouts differ from one another. Is it just a matter of following different branches at each level in different proportions? Obviously Leela isn't flawless and it doesn't produce binary results, so there has to be some level of randomness incorporated in the process.
-
lightvector
- Lives in sente
- Posts: 759
- Joined: Sat Jun 19, 2010 10:11 pm
- Rank: maybe 2d
- GD Posts: 0
- Has thanked: 114 times
- Been thanked: 916 times
Re: Was the ear-reddening move a divine move?
Since again there seems to be confusion about this, relating mostly to oversimplifications of the relevant ideas, here's an attempt to describe succinctly and accurately without any of those usual easy-to-misinterpret oversimplifications.
A 65% win rate means that, following what the bot considers to be likely good play by both sides over the next few moves[1], on average the resulting positions are ones that the neural net believes are "similarly good" to positions it has seen in the training data where the player-to-move in that data won about 65% of the time[2]. The training data usually consists of a slightly old-and-weaker version of that bot playing itself many times using a certain fixed number of playouts[3], and with much heavier randomization than normal[4].
[1] Of course, occasionally the bot's reading may entirely be overlooking a good move by one or both sides.
[2] But still limited by the neural net's ability to understand and compare those positions. Larger nets will on average have better understanding, but they can still massively blunder/misjudge from time to time.
[3] In particular, this means that the 65% is NOT an estimate of how likely this version would be to win with the potentially very different number of playouts that you are running it with.
[4] More randomization provides the neural net with richer and more varied training data to learn from, but also means that the bot in the training data is much more likely to blunder than normal, which of course also affects the win % just like the other things mentioned in [3].
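lightvector's point that the value target is the win frequency among similar training positions can be illustrated with a toy example. Identical dictionary keys here stand in for "similar" positions, which is a big simplification: a real net sees raw board tensors and has to generalize rather than look anything up.

```python
from collections import defaultdict

# Hypothetical training records: (position_key, black_won).
training_games = [
    ("pos_A", 1), ("pos_A", 1), ("pos_A", 0),                # Black won 2/3
    ("pos_B", 0), ("pos_B", 1), ("pos_B", 0), ("pos_B", 0),  # Black won 1/4
]

outcomes = defaultdict(list)
for key, black_won in training_games:
    outcomes[key].append(black_won)

# The value a net is trained toward is the mean outcome over similar positions.
value = {key: sum(v) / len(v) for key, v in outcomes.items()}
print(value)  # pos_A trains toward ~0.67, pos_B toward 0.25
```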
-
Tryss
- Lives in gote
- Posts: 502
- Joined: Tue May 24, 2011 1:07 pm
- Rank: KGS 2k
- GD Posts: 100
- KGS: Tryss
- Has thanked: 1 time
- Been thanked: 153 times
Re: Was the ear-reddening move a divine move?
And my feeling is that we can somehow describe LZ's percentages (but not Elf's) in human terms with something like this:
45-55%: equal game
55-65%: slight advantage
65-75%: noticeable advantage
75-85%: huge advantage
over 85%: won game (LZ will resign under 10%)
Also, note that the winrates are not independent during the same game, so if you want to test the accuracy of LZ's predictions, you need to fix a move number, then test the distribution.
For example, if we want to know whether LZ's predictions are accurate for a category of players (for example pro players, or low dans on the internet), an experiment could look like this:
Take 1000 games by these players that didn't end by resignation before move 120, evaluate the position at move 80 in each, group the games into 5% bins by LZ's prediction, and see if the results are close to the predictions.
After that, we could also verify the "internal consistency": take two data points from each game, and see how the predictions correlate with the true result. Are the earlier estimates more reliable than the later ones?
I may try to write a Python program like this and test it on something like 100 games by strong SDKs with 5 prediction groups (0-15, 15-40, 40-60, 60-85, 85-100). (But no guarantee!)
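The bucketing step of that experiment can be sketched as below. The data here are fabricated as perfectly calibrated predictions purely to exercise the grouping logic; a real test would pull LZ evaluations and game results from SGF files instead:

```python
import random

random.seed(1)

# Fabricated data: (winrate predicted at move 80, did Black actually win).
# The fabricated predictions are perfectly calibrated by construction.
games = [(p := random.random(), int(random.random() < p)) for _ in range(1000)]

BUCKETS = [(0, 15), (15, 40), (40, 60), (60, 85), (85, 100)]  # percent

for lo, hi in BUCKETS:
    results = [won for p, won in games if lo <= p * 100 < hi]
    if results:
        print(f"{lo:3d}-{hi:3d}%: actual Black winrate "
              f"{100 * sum(results) / len(results):.0f}% "
              f"over {len(results)} games")
```

If LZ's numbers were well calibrated for that pool of players, each bucket's actual winrate should land near the bucket's range; systematic deviation would show where the human interpretation table needs adjusting.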
-
Kirby
- Honinbo
- Posts: 9553
- Joined: Wed Feb 24, 2010 6:04 pm
- GD Posts: 0
- KGS: Kirby
- Tygem: 커비라고해
- Has thanked: 1583 times
- Been thanked: 1707 times
Re: Was the ear-reddening move a divine move?
lightvector wrote: Since again there seems to be confusion about this, relating mostly to oversimplifications of the relevant ideas, here's an attempt to describe succinctly and accurately without any of those usual easy-to-misinterpret oversimplifications.
A 65% win rate means that, following what the bot considers to be likely good play by both sides over the next few moves[1], on average the resulting positions are ones that the neural net believes are "similarly good" to positions it has seen in the training data where the player-to-move in that data won about 65% of the time[2]. The training data usually consists of a slightly old-and-weaker version of that bot playing itself many times using a certain fixed number of playouts[3], and with much heavier randomization than normal[4].
[1] Of course, occasionally the bot's reading may entirely be overlooking a good move by one or both sides.
[2] But still limited by the neural net's ability to understand and compare those positions. Larger nets will on average have better understanding, but they can still massively blunder/misjudge from time to time.
[3] In particular, this means that the 65% is NOT an estimate of how likely this version would be to win with the potentially very different number of playouts that you are running it with.
[4] More randomization provides the neural net with richer and more varied training data to learn from, but also means that the bot in the training data is much more likely to blunder than normal, which of course also affects the win % just like the other things mentioned in [3].
Thanks for the succinct explanation - and to others who have similarly added to the discussion.
Given this explanation, could you elaborate on what is happening when the win rate changes as the number of playouts increases? E.g. if I let LZ sit there, the percentages start to change. Obviously the training data hasn't changed, so something about the playouts happening right now is affecting the win rate, right? Is it just that the search is finding different board positions a few moves ahead, which match up with different training-like positions, thereby adjusting the probability?
Thanks.
be immersed
-
hyperpape
- Tengen
- Posts: 4382
- Joined: Thu May 06, 2010 3:24 pm
- Rank: AGA 3k
- GD Posts: 65
- OGS: Hyperpape 4k
- Location: Caldas da Rainha, Portugal
- Has thanked: 499 times
- Been thanked: 727 times
Re: Was the ear-reddening move a divine move?
Tryss wrote: I think our neural nets are less sensitive to this, but this kind of thing can happen.
I don't know if you meant go programs by "our neural nets", but in a very real sense, our human neural nets are similarly vulnerable, as a recent paper apparently demonstrated: https://spectrum.ieee.org/the-human-os/ ... ial-images