Zen beats pro with 2 stones handicap, another at EGC?

For discussing go computing, software announcements, etc.
Mike Novack
Lives in sente
Posts: 1045
Joined: Mon Aug 09, 2010 9:36 am
GD Posts: 0
Been thanked: 182 times

Re: Zen beats pro with 2 stones handicap, another at EGC?

Post by Mike Novack »

"- Go programs do fairly poorly if they end up behind in even or low-handicap games because the wins they perceive are from mistakes in the playouts that are likely to be significantly below the level of play of the opponent. The distribution of mistakes a human will make falls off far more sharply than that of a playout as the mistake gets simpler and more extreme. So if a program "sees" no way to win except to try to elicit a clear mistake from the opponent, then since it has no good model of what mistakes are likely and perceives many extraordinarily unlikely mistakes to be merely somewhat unlikely, it will try to elicit those extraordinarily unlikely mistakes and do things that are easy to refute. And the more extreme a mistake it sees as needed, the more trivial and easy to refute. Modeling and exploiting the distribution of mistakes an opponent is likely to make in reality is a very hard problem and is, as far as I know, unsolved."

I think this misses the real point: the inability to evaluate when to resign. The program should resign when it sees that it cannot pull the game out EVEN if the opponent makes a mistake (when ALL mistakes, rare ones as well as silly ones, become insignificantly unlikely).

So yes, "Modeling and exploiting the distribution of mistakes an opponent is likely to make in reality is a very hard problem and is, as far as I know, unsolved." ESPECIALLY as this modelling would have to be different for every possible level of the opponent's strength. It is NOT just the position but the judged strength of the opponent that matters. You should resign that position if playing against a 3 dan but not against a 10 kyu, because the latter might make a mistake that would be inconceivable for the 3 dan.

Go back to what some of us are saying. Take another look at those games at the point where the program began making silly moves. Would a human player, against an opponent of that level, make another move, or would he or she resign? << and here I do NOT include a few forcing moves made to have time to verify the count and read out any remaining issues >> If the "right" move is to resign (0% chance of winning the game), don't fault the program for making a silly move that has only a 0.01% chance of winning, because the latter IS "the better move" except for the annoyance of the human opponent.
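The opponent-strength-dependent resignation rule argued for above can be put into a toy sketch. Everything here is a labeled assumption: the per-rank blunder probabilities, the threshold, and the function name are all illustrative inventions, not values from any real engine.

```python
# Toy sketch: decide whether to resign based on the estimated chance
# that THIS opponent blunders badly enough to flip the game.
# All numbers below are illustrative assumptions, not tuned engine values.

# Hypothetical per-rank probability that the opponent makes a
# game-losing mistake in an otherwise settled position.
BLUNDER_PROB = {
    "3d": 0.001,   # a 3 dan almost never throws away a won game
    "10k": 0.05,   # a 10 kyu sometimes does
}

def should_resign(win_prob_without_blunder: float,
                  opponent_rank: str,
                  threshold: float = 0.01) -> bool:
    """Resign when even an opponent blunder leaves our total winning
    chance below `threshold`."""
    p_blunder = BLUNDER_PROB[opponent_rank]
    # Win either on merit, or because the opponent gifts the game.
    p_win = win_prob_without_blunder + (1 - win_prob_without_blunder) * p_blunder
    return p_win < threshold

# Same hopeless position, different opponents:
print(should_resign(0.0, "3d"))    # → True  (resign against the 3 dan)
print(should_resign(0.0, "10k"))   # → False (keep playing against the 10 kyu)
```

The point the sketch makes is exactly Mike's: the position is identical in both calls; only the model of the opponent changes the correct decision.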
lightvector
Lives in sente
Posts: 759
Joined: Sat Jun 19, 2010 10:11 pm
Rank: maybe 2d
GD Posts: 0
Has thanked: 114 times
Been thanked: 916 times

Re: Zen beats pro with 2 stones handicap, another at EGC?

Post by lightvector »

Mike, I suspect you're perceiving disagreement or argument where there isn't any? We're simply talking about different regimes. In games against similarly-strong opponents:

* When only slightly behind, at least in my observation, modern Go programs usually do okay at upping the tension a little and keeping things unstable and complex. I believe this often happens in the range where MCTS is seeing playout win rates of around 40%-50%. Note that the real chance of winning the game will generally be lower than that - playouts are a bit noisy.

* When somewhat more behind, where winning is unlikely but still not out of the question, Go programs frequently do more poorly, in the way I described in the last post. I think this often corresponds to playout win rates of around 30%-45%, where the real chance of winning the game with good active play and complication would be, say, 5%-20%.

* When even more behind and where resigning becomes sensible, you get the regime that Mike is referring to. This is where playouts are reporting win rates below 30%, and the real chance to win really starts to approach zero. In that case, yes, there's no faulting what the program does in a dead lost position where nothing works and yet you've told it to try to find a way to actually win.

Obviously, the ranges I mentioned are approximate and depend on the program, the kinds of positions on the board in that specific game, etc.
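The three regimes above can be restated as a rough lookup on the playout win rate. The cutoffs below are just one way to split the overlapping ranges given in the post; as noted, they are approximate and would vary by program and position.

```python
def playout_regime(winrate: float) -> str:
    """Map an MCTS playout win rate to the rough behavioral regime
    described above. Cutoffs are illustrative; the post's ranges
    overlap, so 0.40 and 0.30 are arbitrary dividing points."""
    if winrate >= 0.40:
        return "slightly behind: complicates a little, usually plays okay"
    elif winrate >= 0.30:
        return "somewhat behind: prone to overplays that are easy to refute"
    else:
        return "dead lost: resignation is the sensible call"

print(playout_regime(0.48))  # slightly behind
print(playout_regime(0.35))  # somewhat behind
print(playout_regime(0.20))  # dead lost
```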

Interestingly, despite no real advances in targeted forms of opponent modeling, the issue has become somewhat less severe as Go programs have developed better move selection - early on, via the feature-and-shape-based move prediction in Remi Coulom's and many other developers' work, and more recently via convolutional neural nets, such as AlphaGo's policy net and the nets that are no doubt being worked on in the latest versions of Crazystone and Zen. This is because a move selection engine trained to approximately mimic the kinds of moves and shapes that strong players actually play simply won't suggest many of the stupid moves that the program would otherwise want to try. There are also other techniques, like "dynamic komi", that are crude but mitigate the problem and have often seen decent results in mid-to-high-handicap games. None of these have really fixed the underlying problem, but they've taken a lot of the edge off.
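The dynamic komi idea mentioned above can be sketched as a simple feedback rule: nudge the engine's internal komi so that playout win rates stay in a band where MCTS still discriminates well between moves. This is a minimal sketch of the general idea only; the band, step size, and function name are made-up illustrative values, not any program's actual scheme.

```python
def adjust_internal_komi(komi: float, winrate: float,
                         low: float = 0.45, high: float = 0.55,
                         step: float = 0.5) -> float:
    """One-step dynamic-komi update: if the engine looks too far ahead,
    it handicaps itself with extra internal komi so playouts stay
    informative; if too far behind, it grants itself virtual points.
    All constants are illustrative."""
    if winrate > high:
        return komi + step   # pretend more points are needed to win
    if winrate < low:
        return komi - step   # pretend fewer points are needed to win
    return komi

# Example: in a high-handicap game where playouts report inflated wins,
# the engine ratchets its internal komi upward over a few moves until
# the reported win rate falls back into the target band.
komi = 7.5
for wr in (0.80, 0.74, 0.68, 0.52):
    komi = adjust_internal_komi(komi, wr)
print(komi)  # → 9.0
```

The design intuition is that MCTS move evaluations are most informative when playout win rates sit near 50%; saturating at 80%+ (common when giving a large handicap) makes most moves look equally fine, which is part of why the crude adjustment helps in handicap games.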