Deep learning and ladder blindness

Bill Spight · **#21**

chut wrote:

Life and death of a group and ladders are related problems. We still see mighty strong bots fall flat in situations that are obvious to humans. These throw a monkey wrench into the tree search, because the minimax evaluation of a whole branch can become invalid if we push the search a bit deeper. That means there is an inherent uncertainty in the evaluated win rate of a branch.

Well, yes, by design. Win rates other than 100% or 0% depend upon errors in play. But what errors? Has anybody published anything about winrate errors?
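To make chut's point about depth concrete, here is a minimal sketch of depth-limited minimax on a toy tree. The tree, the node values, and the names `minimax`, `ladder_move`, and `quiet_move` are all made up for illustration; nothing here comes from an actual bot. It only shows how the value assigned to a branch can flip when the search goes one ply deeper.

```python
# Toy illustration only: a node is (static_eval, children), with values
# from the point of view of the player to move at the root.

def minimax(node, depth, maximizing):
    """Depth-limited minimax; falls back to the static evaluation at the horizon."""
    static_eval, children = node
    if depth == 0 or not children:
        return static_eval
    values = [minimax(child, depth - 1, not maximizing) for child in children]
    return max(values) if maximizing else min(values)

# Hypothetical branch A: starting a ladder looks good statically (+5),
# but one ply deeper the opponent has a reply that refutes it (-10).
ladder_move = (5, [(-10, [])])
# Hypothetical branch B: a quiet move that stays at +1.
quiet_move = (1, [(1, [])])
root = (0, [ladder_move, quiet_move])

shallow = minimax(root, 1, True)  # the search stops before the refutation
deeper = minimax(root, 2, True)   # the search sees the refutation
print(shallow, deeper)
```

At depth 1 the root is valued by the ladder branch; at depth 2 that branch collapses and the quiet move determines the value. Any number derived from the shallow search, win rate included, inherits that instability.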

Quote:

It does seem to me that we can't escape having a meta-level tree search guidance system. That is probably worthy of a deep learning project.

Can't what we already have be described as a meta-level tree search guidance system? Tsumego, ladder reading, semeai, etc., involve local searches (although they may cover the whole board). Humans can integrate the results of local searches. Monte Carlo Tree Search is global. Before MCTS came along, I know that Martin Mueller experimented with having the computer integrate the results of local searches. MCTS was wildly successful, though, and dominates current thinking.

Humans typically evaluate games by estimating territory, and territory estimates adapt easily to different komis. I think that people are still experimenting with them instead of winrates, but before the advent of AlphaGo winrates worked better with MCTS. After all, the aim is to win the game, not to win it by a larger margin. After AlphaGo you get the bullshit about how humans can't think in terms of probabilities. Well, yes, you either want the probability of winning the game, or, using fuzzy logic, the degree to which the current position, plus having the move, belongs to the set of won games. Fuzziness and probability are different kinds of uncertainty.

In theory, a territory estimate is not in general enough to estimate whether a game is won or not. You also need a parameter called temperature. The temperature at the start of the game is around 14 pts., and it diminishes, with ups and downs, over the course of the game. It is possible to estimate it. And the global temperature is the same as the maximum of the local temperatures. (Although we may want to use a different definition which is insensitive to temporary increases.) Territory and temperature are parameters that may enable us to combine local searches. Furthermore, it is in theory possible to utilize the estimates of temperature and territory in combination to come up with a fuzzy estimate of winning or losing. In fact, currently we can say that if Black is ahead by an estimated 12 pts. on the board, with a 7.5 komi, and both players play perfectly, Black has a nearly won game, even if White has the move. (My estimate is better than 80% won, which is not an 80% winrate, BTW.) OC, nobody plays perfectly, so we still need to develop error measures. (Note that, unlike probability, fuzziness produces uncertainty even if we assume perfect play.)
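As a rough sketch of how territory and temperature might be combined, the snippet below takes the global temperature as the maximum of the local temperatures, as stated above, and turns a margin plus a temperature into a fuzzy degree of "won". The function names, the rule of thumb that having the move is worth about half the temperature, and the linear ramp are all assumptions made for illustration. This is not the formula behind the 80% figure above and it is not calibrated to it.

```python
def global_temperature(local_temperatures):
    """Global temperature taken as the maximum of the local temperatures."""
    return max(local_temperatures, default=0.0)

def net_margin(board_lead, komi):
    """Black's estimated lead on the board after subtracting komi."""
    return board_lead - komi

def fuzzy_won(margin, temperature, has_move):
    """Hypothetical degree (0..1) to which the position belongs to 'won games'.

    Assumptions, not established theory:
    - having the move is worth about half the temperature;
    - membership ramps linearly from 0 to 1 as the adjusted margin
      goes from -temperature to +temperature.
    """
    adjusted = margin + (temperature / 2 if has_move else -temperature / 2)
    if temperature <= 0:
        # Nothing left to play for: the margin alone decides.
        if adjusted > 0:
            return 1.0
        return 0.0 if adjusted < 0 else 0.5
    return min(1.0, max(0.0, 0.5 + adjusted / (2 * temperature)))

# Made-up local temperatures for three areas of the board.
t = global_temperature([3.0, 6.5, 1.0])
m = net_margin(12, 7.5)  # Black 12 pts. ahead on the board, komi 7.5
print(t, m, fuzzy_won(m, 3.0, has_move=False))
```

Note that the result is a degree of membership, not a winrate: it is below 1 for close positions even under the assumption of perfect play, which is the distinction between fuzziness and probability drawn above.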

However, the theory behind MCTS is already well developed. Fuzzy logic has proven itself in other applications, and we may see a theoretical or practical breakthrough in the future which will allow us to apply fuzzy logic to build better go bots.

Edited for correctness and clarity.

Mike Novack · **#22**

There is more to this than "ladder blindness". Go is difficult. The problem of apparent blindness to the consequences of EXISTING ladders (that seem oh so obvious to much weaker human players) ignores that there is a more general problem, the consequences of POTENTIAL ladders, and those are not at all obvious to weaker human players.

Ladders are "in play" in games between strong human players even though we do not see those ladders manifest in the games. The point I am making is that the mere fact that some local situation might be good or bad (depending on who would win a ladder beginning there) makes any number of moves along the route of that potential ladder sente, including moves that might be minor local losses in that other area.

A human learning go is doing it "step by step". The beginner learns about ladders, how to determine the outcome based on stones already on the board. Only much later comes learning about how the threat of a ladder makes remote moves sente. The neural net learning "from zero" has to learn to solve the entire problem all at once.

In "problem solving" a useful skill is being able to figure out where a complex problem can be usefully broken into component parts. Humans who are good at problem solving are good at this.
