O Meien on AlphaGo Zero

John Fairbairn · Post by **John Fairbairn** » Thu Jan 25, 2018 11:38 am

I have just been reading O Meien's views on AlphaGo Zero. He has long had an interest in computers and is of course a top pro, so his views should carry some weight.

The main thing he has noticed is that AGZ is "good at living." He observes that it makes many ordinary moves when attacked but has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space. These unusual moves, says O, are moves that normally look bad. (He doesn't spell it out, but the inference seems to be that human pros have blind spots about such moves.)

Because of this ability, it can invade the opponent's sphere of influence with impunity. This makes it different from pre-Zero AG, which tended to favour large-scale surrounding attacks and only rarely got itself into shinogi situations.

There were plenty of other things he commented on, but this particular point seems especially significant to me because it seems to suggest AG is improving by becoming more and more of a tactics-calculating machine. Maybe it hasn't got that much more to tell us about go strategy?

moha · Post by **moha** » Thu Jan 25, 2018 4:00 pm

I have the feeling that the balance between influence and territory changes with strength, a bit like komi. The stronger the player, the less value there is in large moyos, since an opponent of the same strength will be able to reduce them more effectively. AGZ probably takes this to the extreme, which is why it is more territorial than Master. And since it no longer needs rollouts, it can search much faster, and thus presumably deeper as well. The few published games show incredible accuracy; the good invasion and living techniques seem a direct consequence of that.

Bill Spight · Post by **Bill Spight** » Thu Jan 25, 2018 4:53 pm

I am leery of generalizing from a single instance. There is no guarantee that another Alpha Zero neural net, with a different training history -- perforce! -- would have the same characteristics as the current AlphaGo Zero. Also, in a few years we will have other bots who are as strong, and we shall see how they play, as well.

jeromie · Post by **jeromie** » Thu Jan 25, 2018 10:32 pm

I feel like strategy and tactics are so closely intertwined that an advance in one will necessarily lead to a change in the other. While we may not be able to learn the sort of principles that can be communicated via an aphorism, if humans can address the tactical blind spots revealed by AlphaGo's play, strategic changes will eventually follow. Staking out a large territory isn't viable if your opponent can destroy it.

And I do think we will learn something from seeing precise (or even merely odd to us) tactical play. While humans may never play with the same unwavering acuity as a computer, even knowing that tactical advances are possible will encourage top players to stretch their limits ever farther.

Uberdude · Post by **Uberdude** » Fri Jan 26, 2018 9:04 am

Just a quick comment on the styles of strong bots: Zen (the non-released version is top pro level) hasn't started doing early 3-3 invasions the way AlphaGo and, later, FineArt (iirc), DolBaram and LeelaZero do. When its opponents play them against it, Zen is often happy to keep extending as the opponent crawls on the 2nd line, and to make the gote wall (though it does sometimes jump instead of hane). Zen also likes to split the opponent's sides (e.g. between a 4-4 and a shimari), which AlphaGo noticeably dislikes. So it still seems to play in a more traditionally human style. In some ways it actually seems more human now, with the neural networks, than the pure MCTS version from before AlphaGo, which was famous for liking the centre a lot, constructing large moyos with weird central moves and then making spectacular kills. Sometimes it will still go for centre moyos, but it feels to me less biased towards them and adapts to the circumstances, happy to go for territory if that's a good way too. It does still like shoulder hits though, as does AG.

sorin · Post by **sorin** » Fri Jan 26, 2018 9:51 am

John Fairbairn wrote:I have just been reading O Meien's views on AlphaGo Zero. He has long had an interest in computers and is of course a top pro, so his views should carry some weight.

The main thing he has noticed is that AGZ is "good at living." He observes that it makes many ordinary moves when attacked but has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space. These unusual moves, says O, are moves that normally look bad. (He doesn't spell it out, but the inference seems to be that human pros have blind spots about such moves.)

Because of this ability, it can invade the opponent's sphere of influence with impunity. This makes it different from pre-Zero AG, which tended to favour large-scale surrounding attacks and only rarely got itself into shinogi situations.

There were plenty of other things he commented on, but this particular point seems especially significant to me because it seems to suggest AG is improving by becoming more and more of a tactics-calculating machine. Maybe it hasn't got that much more to tell us about go strategy?

Very interesting! I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions? Although, I do remember such a case also from the game AG played against the Chinese team ("consultation go") in Wuzhen, specifically move 60: http://www.alphago-games.com/view/event ... /3/move/60 - more of a bad-shape tesuji than a life-and-death situation, but nevertheless it led to a surprisingly quick escape from what seemed at first a severe attack by the human team.

As for "Go strategy" - what is strategy? Is it not just humans attaching words to situations that seem mysterious just because they are way beyond our reading/tactical ability?
If it turns out that one can live in much tighter spaces than pros currently think they can, the general way to play early in the game ("strategy") will change a lot, so AG is teaching us a lot about strategy, I think.

Uberdude · Post by **Uberdude** » Fri Jan 26, 2018 10:44 am

sorin wrote: I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions?

This doesn't quite fit "has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space" but the sequence from move 48-60 in this AGZ 20-block vs AG Lee game is one of my favourites http://www.alphago-games.com/view/event ... /3/move/48

[go]$$B Black moyo?
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . O O O . . . . . . . . . . . . |
$$ | . O O O X X X X . . . . . . . . . . . |
$$ | . X X X . . . . . , . . . . . O . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . X . . . . . . . . . . . |
$$ | . . X . X X . . . O . . . . . . . . . |
$$ | . X . O . O X . O . . . . . . . . . . |
$$ | . . . O . O . . . . X . . . . O . . . |
$$ | . . X X . . . . O . . . . . . . . . . |
$$ | . . X O O . O . . X . . . X . O . . . |
$$ | . . X X O . . X . . X . . X O . . . . |
$$ | . . . . . . . . . . . O . O . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]

[go]$$B Moyo schmoyo, white territory!
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . O O O . . . . . . . . . . . . |
$$ | . O O O X X X X . . . . . . . . . . . |
$$ | . X X X X . . . . , . . . . . O . . . |
$$ | . . . . O . O X . . . . . . . . . . . |
$$ | . . O . . . . X . . . . . . . . . . . |
$$ | . . . . . . O X . . . . . . . . . . . |
$$ | . . O . . . . . . . . . . . . . . . . |
$$ | . . . . . . O X . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . , . . . |
$$ | . . . . . . . X . . . . . . . . . . . |
$$ | . . X X X X . . . O . . . . . . . . . |
$$ | . X . O . O X . O . . . . . . . . . . |
$$ | . . . O . O . . . . X . . . . O . . . |
$$ | . . X X . . . . O . . . . . . . . . . |
$$ | . . X O O . O . . X . . . X . O . . . |
$$ | . . X X O . . X . . X . . X O . . . . |
$$ | . . . . . . . . . . . O . O . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]

P.S. Something I wonder: should Black 45 have connected solidly against the peep? That would allow White to jump to the 2nd line, which separates the corner and destroys some territory, but the corner is safe and White makes no eyes. That is thicker and takes away one of the sente moves White used to live inside, so living would be harder. Would White still go in so deep, or reduce more gently?

sorin · Post by **sorin** » Fri Jan 26, 2018 11:40 am

Uberdude wrote:
sorin wrote: I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions?
This doesn't quite fit "has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space" but the sequence from move 48-60 in this AGZ 20-block vs AG Lee game is one of my favourites http://www.alphago-games.com/view/event ... /3/move/48

This sequence is absolutely amazing indeed - it almost looks like it is made up.

johnsmith · Post by **johnsmith** » Fri Jan 26, 2018 1:50 pm

sorin wrote:
Uberdude wrote:
sorin wrote: I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions?
This doesn't quite fit "has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space" but the sequence from move 48-60 in this AGZ 20-block vs AG Lee game is one of my favourites http://www.alphago-games.com/view/event ... /3/move/48
This sequence is absolutely amazing indeed - it almost looks like it is made up

One of my favorites as well! There seems to be a black stone missing at h3.

Uberdude · Post by **Uberdude** » Fri Jan 26, 2018 2:11 pm

johnsmith wrote: There seems to be missing a black stone at h3

Fixed, thanks.

djhbrown · Post by **djhbrown** » Fri Jan 26, 2018 4:56 pm

Bill Spight wrote:I am leery of generalizing from a single instance. There is no guarantee that another Alpha Zero neural net, with a different training history -- perforce! -- would have the same characteristics as the current AlphaGo Zero.

As Alexandre Dumas remarked, any generalisation is dangerous, so it is dangerous to generalise that generalising from a single instance offers no guarantee of the same characteristics; in this case, for one simple reason:
1. Alfie0 has put Monte-Carlo where it belongs, in the wastepaper basket, and with no random element to her, and no change to the microworld of Go, there is no reason to believe that she wouldn't tread the same path and end up looking the same if you were to crank her up again from the beginning, digging another hole in the same place and expecting a different result.

Because the world of Go is so well-defined, and so restricted, there is every reason to believe that Go is axiomatisable - the broad approach of Swim - that's to say, there are universal truths of Go that can be established by logical deduction within a domain model - the sort of thing that Russell and Whitehead tried to do for arithmetic.

Alfie0's behaviour is so markedly similar to that of the ancient greats that there's a fair chance both she and they have started to uncover what those truths are, one of which would be that Alfie "Master" (sic) is a Sorcerer's Apprentice, wrong about almost everything almost all of the time.

Of course, a different DCNN configuration (e.g. with more layers and/or an improved learning algorithm) would indeed tread a different path and might end up on top. Even so, my own gut feel is that it wouldn't have a different style to Alfie0, just superior reading.

Thumbs up for Alfie0.

Tryss · Post by **Tryss** » Fri Jan 26, 2018 6:26 pm

djhbrown wrote: 1. Alfie0 has put Monte-Carlo where it belongs, in the wastepaper basket, and with no random element to her, and no change to the microworld of Go, there is no reason to believe that she wouldn't tread the same path and end up looking the same if you were to crank her up again from the beginning, digging another hole in the same place and expecting a different result.

I hope that you're aware that Alpha Zero uses Monte-Carlo...

djhbrown · Post by **djhbrown** » Fri Jan 26, 2018 7:30 pm

Tryss wrote:I hope that you're aware that Alpha Zero uses Monte-Carlo...

You hope in vain. Their paper says - albeit not in black and white - that it doesn't.

"AlphaGo Zero is the program described in this paper. It learns from self-play reinforcement learning, starting from random initial weights, without using rollouts." (my emphasis)

However, it's easy to be confused, as they go on to say:
"AlphaGo Zero is provided with perfect knowledge of the game rules. These are used during MCTS, to simulate the positions resulting from a sequence of moves, and to score any simulations that reach a terminal state".

So as its senior author doesn't know the difference between tree search and random tree search, they should maybe have given the job of final proofreading to a different member of the team who does.

The reason Monte-Carlo is called Monte-Carlo is that it is based on roulette-like random state transitions. An upper confidence bound probabilistic search based solely on a heuristic move generator (the policy net) which embodies no random element has no random character and hence is not Monte-Carlo.

However, i overlooked that they say it starts out with random initial weights (which seems to me to be wholly unnecessary), so i take it all back.
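A sketch of the selection rule under discussion may help separate the two senses of "random" here. Assuming the PUCT form given in the AlphaGo Zero paper, each simulation descends the tree by picking the child with the highest Q + U, where U grows with the policy prior and shrinks with the visit count. The function name `puct_select`, the constant `c_puct=1.0` and the example numbers are all illustrative assumptions, not values from the paper.

```python
import math

def puct_select(priors, visit_counts, mean_values, c_puct=1.0):
    """Return the index of the child maximising Q + U.

    U = c_puct * P * sqrt(total child visits) / (1 + N).
    c_puct=1.0 is an arbitrary illustrative choice.
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (p, n, q) in enumerate(zip(priors, visit_counts, mean_values)):
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)
        if q + u > best_score:  # strict '>' so ties go to the lowest index
            best_index, best_score = i, q + u
    return best_index

# Hypothetical numbers for three candidate moves at one node.
priors = [0.6, 0.3, 0.1]       # policy-net probabilities (made up)
visit_counts = [10, 2, 0]      # N(s, a)
mean_values = [0.1, 0.3, 0.0]  # Q(s, a)
choice = puct_select(priors, visit_counts, mean_values)
```

Given the same priors, counts and values, the function always returns the same move, which is the sense in which this step has no roulette-like element. Whether that justifies dropping the name MCTS is the point in dispute in this thread.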

Tryss · Post by **Tryss** » Fri Jan 26, 2018 9:50 pm

Page 14 of the AlphaZero preprint, Configuration: "During training, each MCTS used 800 simulations."

djhbrown · Post by **djhbrown** » Fri Jan 26, 2018 10:34 pm

yes, the paper refers to MCTS in many places, one of which i quoted.

The thing is, Alfie0 doesn't do random rollouts - which i regard as a significant technological development, one which takes her away from making weird moves like Master et al, and takes her away from going "On Tilt", and makes her overall behaviour much closer to the received wisdom of sages down the ages.

I regard this as deeply significant; i see Alfie0 as qualitatively different and a huge step forward from Alfie Master. They are chalk and cheese. Sure, it's a small step for her programmer, but a giant leap forward for heuristic search, and for Go theory - even if it's also a (sensible!) step back to the good old days before Monte Python.

Random rollouts are, to me, the hallmark of MCTS; its very essence.

Alfie0 differs from Alfie Fan and Alfie etc in that it doesn't do them.

Maybe PHTS (Probabilistic Heuristic Tree Search) would be a more accurate name than MCTS for the kind of search Alfie0 does.

p2 wrote:Our program, AlphaGo Zero, differs from AlphaGo Fan and AlphaGo Lee in several important aspects. First and foremost, it is trained solely by self-play reinforcement learning, starting from random play, without any supervision or use of human data. Second, it only uses the black and white stones from the board as input features. Third, it uses a single neural network, rather than separate policy and value networks. Finally, it uses a simpler tree search that relies upon this single neural network to evaluate positions and sample moves, without performing any Monte-Carlo rollouts. To achieve these results, we introduce a new reinforcement learning algorithm that incorporates lookahead search inside the training loop, resulting in rapid improvement and precise and stable learning.
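The distinction the quoted passage draws, between scoring a leaf with random rollouts and scoring it with a single learned evaluation, can be illustrated on a toy game. This is a minimal sketch only: Nim stands in for Go, `value_fn` is a hypothetical stand-in for the value output of the network (here it just encodes the known Nim result), and none of the names come from the paper.

```python
import random

def legal_moves(stones):
    """Toy Nim: a move removes 1, 2 or 3 stones; taking the last stone wins."""
    return [k for k in (1, 2, 3) if k <= stones]

def rollout_value(stones, rng):
    """Rollout-style leaf evaluation: play uniformly random moves to the end.

    Returns +1 if the player to move at the leaf wins the playout, else -1.
    """
    leaf_player_to_move = True
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            # Whoever just moved took the last stone and wins.
            return 1 if leaf_player_to_move else -1
        leaf_player_to_move = not leaf_player_to_move

def rollout_estimate(stones, n_playouts, seed):
    """Average many random playouts; the result depends on the seed."""
    rng = random.Random(seed)
    return sum(rollout_value(stones, rng) for _ in range(n_playouts)) / n_playouts

def value_fn(stones):
    """Hypothetical value-network stand-in: a fixed function of the position.

    In this Nim variant the player to move loses exactly when the stone
    count is a multiple of 4.
    """
    return -1.0 if stones % 4 == 0 else 1.0

leaf = 5
noisy = [rollout_estimate(leaf, 20, seed) for seed in range(3)]  # seed-dependent
fixed = [value_fn(leaf) for _ in range(3)]                       # always identical
```

The rollout estimate is an average over random games and so varies with the seed, while the value function returns one number per position. The quoted passage describes the second kind of leaf evaluation, yet, as Tryss points out, the paper keeps the name MCTS for the surrounding search.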

Life In 19x19
