Re: This 'n' that
Posted: Fri Jul 21, 2017 12:05 pm
Well, it has been a while, and this may be a slight digression, but it is about AlphaGo.
I'll get back to my own theme later.
I don't think that we can attribute that tendency to Monte Carlo randomness. Pure Monte Carlo sucks. Monte Carlo Tree Search (MCTS) is very good, as we know. But I think — and I would be happy to be corrected — that the combination of Neural Networks with MCTS is a hybrid. The neural nets are not just modifications of MCTS to make it better, they are a different approach, and each approach gains from the other.
This "flitting about", I suspect, has two main reasons. First, AlphaGo does not make plans. Yes, it builds a search tree, but that is not the same thing. Znosko-Borovsky advised humans to make plans (in chess) and not to try to find the best play. AlphaGo tries to find the best play. It may randomize among plays that have nearly the same evaluations, but its evaluation is based in part on randomness, so maybe not. (Uberdude, maybe this is what you had in mind with Monte Carlo randomness.) But even if its evaluation is not random at all, I think that we would get the same effect through errors. Any evaluation function in the opening will be uncertain.
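To illustrate the point about errors, here is a toy sketch, not anything AlphaGo actually does. It assumes a handful of candidate plays in different parts of the board whose true values are nearly equal, adds a small Gaussian error to each evaluation, and always picks the apparent best. The move names, the win-rate numbers, the size of the error, and the function name pick_counts are all made up for illustration.

```python
import random
from collections import Counter


def pick_counts(true_values, noise_sd, trials, seed=0):
    """Count how often each candidate looks best under a noisy evaluation."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    counts = Counter()
    for _ in range(trials):
        # Each evaluation is the true value plus an independent error.
        noisy = {move: value + rng.gauss(0.0, noise_sd)
                 for move, value in true_values.items()}
        # The player always chooses the apparent best play.
        counts[max(noisy, key=noisy.get)] += 1
    return counts


# Hypothetical win-rate estimates for four opening plays in different regions.
candidates = {
    "upper-left approach": 0.502,
    "lower-right enclosure": 0.500,
    "left-side extension": 0.498,
    "local follow-up": 0.495,
}

trials = 10_000
counts = pick_counts(candidates, noise_sd=0.01, trials=trials)
for move, n in counts.most_common():
    print(f"{move:22s} chosen {n / trials:.1%} of the time")
```

With no error at all (noise_sd=0.0) the same play is chosen every time. Once the error is comparable to the gaps between the candidates, the choice hops among regions even though the player is always trying to find the best play.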
Second, I think that the tenukis reflect the nature of the opening. There are relatively few sente in the opening. Let me give one of my favorite illustrations. At the beginning of modern thinking about the opening, a few centuries ago, Black would play on a 3-4 point, White would make the 5-3 approach, and Black would make a pincer, usually one or two spaces. Then White would play in a different corner. This was actually a pretty sophisticated idea by White. White could be satisfied with having prevented an enclosure with sente. After a while, Black stopped making the pincer, but played in a different corner himself. The approach was not sente. Eventually White stopped playing the approach on move 2, although it has not entirely died out.
Consider the situation after the Black pincer at move 3. The pincered stone has no base and is subject to further attack. Doesn't protecting it have some urgency? In the current vernacular, isn't that corner hotter than an open corner? The answer is no, and the ancients knew it. So does AlphaGo. As I have explained before, in general, as stones are added to a region of the board, the local temperature drops. Of course, there are sente, and the local temperature might not drop if the stones were played randomly, but they are not. Stones of the same color are played to help each other, or to work together. This coordination tends to strengthen the stones and thus to lower the temperature. There are hotter plays elsewhere.
Now, this is something I have known for quite some time, hence my "proverb", Tenuki is always an option.
But AlphaGo has surprised me. For instance,
The [...] stones are strong, and so I have believed that the local temperature has dropped, and have often played tenuki. However, AlphaGo usually does not tenuki now, but connects the [...] stones.
AlphaGo usually makes the solid connection and tenukis after [...]. But the [...] stones do not have a base, nor do they have any eye shape. Don't they need to extend to "a" or something? Apparently not.
(Although sometimes AlphaGo does make an extension.)
This sequence, playing [...] and then tenuki, shows that AlphaGo's tenukis are not the result of randomness. If we find them hard to understand, I submit that the fault, dear Brutus, is in ourselves.
This quote is from Uberdude's study journal ( https://www.lifein19x19.com/viewtopic.p ... 74#p221574 ) commenting on AlphaGo self-play game 38.
Uberdude wrote: AlphaGo seems to have a tendency to flit around the board, making exchanges at what may appear random times. Is there some meaning behind these moves: in that they are probes and depending on how one is answered play will resume at another place in a different way, or are they just Monte-Carlo style random exchanges which could have been made another time?