Confirmation bias in neural nets?

For discussing go computing, software announcements, etc.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: Confirmation bias in neural nets?

Post by Uberdude »

Bill Spight wrote:Back when I was considering such things, it was plain that a good player needed to be able to refute bad moves, but there are some bad moves that a good opponent will never play, and so the ability to refute them may not be learned
There was a nice example of this I posted about a while ago, I think from a neural network bot shortly before AlphaGo came along. Said bot was doing well in a fight and had trapped some key cutting stones in a potential crane's nest tesuji (they had the 3 liberties and the opponent the 2 extensions on the side). The other bot then played the one-point jump to escape, which everyone stronger than 20 kyu knows is doomed to fail. The neural bot wedged, the other bot gave atari, and the neural bot connected instead of playing the squeeze; whoopsy, game over. A little MCTS reading would have saved the day, but presumably playing out the doomed crane's nest tesuji is so rare in the strong-player games used for training that the neural network hadn't learned how to refute it.

Also, with exploration vs exploitation there is a conflict in what is needed in different situations. We see blind spots where bots don't even consider moves 1 ply deep, which is due to insufficient exploration. But to read ladders you want high exploitation and low exploration, to quickly go deep down the 1 relevant variation.
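This tension shows up directly in the PUCT selection formula that AlphaGo-style engines use to pick which child to visit next. A minimal sketch (the priors and visit counts below are invented for illustration):

```python
import math

def puct_bonus(prior, parent_visits, child_visits, c_puct=1.0):
    """Exploration term of AlphaGo-style PUCT selection (sketch).

    The bonus is proportional to the policy prior, so a move the net
    assigns near-zero probability gets almost no exploration budget.
    """
    return c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

# A 1-ply blind spot: the refutation has prior 0.0005, so even after
# 10,000 parent visits its bonus for being unexplored is tiny...
blind = puct_bonus(prior=0.0005, parent_visits=10_000, child_visits=0)

# ...while the forced ladder move the policy loves keeps a big bonus,
# so the search happily dives down the single relevant variation.
ladder = puct_bonus(prior=0.95, parent_visits=10_000, child_visits=50)

print(f"{blind:.4f} vs {ladder:.4f}")
```

With one exploration constant shared across every position, raising it enough to catch the blind spot also dilutes the deep, narrow reading a ladder needs.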
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: Confirmation bias in neural nets?

Post by moha »

Uberdude wrote:But to read ladders you want high exploitation and low exploration to quickly go deep down the 1 relevant variation.
This assumes that the opponent (or its policy view) will also continue a losing ladder, which in turn would mean that the net puts high policy on ladder moves regardless of whether they work. That would be a failure of the net (it could do better), and it is also unlikely from a training point of view, since the net is trained towards the result of a search, and the search result will not be the ladder move if the ladder doesn't work (which is the case 50% of the time).
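The training-target point can be made concrete: in AlphaZero-style training the policy is pushed toward the search's visit distribution, not toward its own raw output. The move names and visit counts below are invented for illustration:

```python
# Sketch: the policy target is the normalised visit distribution of the
# search. If the search reads the ladder out and finds it fails, the
# ladder move collects few visits, so training pushes the policy away
# from it even if the raw net initially liked it.
visits = {"continue_ladder": 12, "give_up_ladder": 820, "tenuki": 168}
total_visits = sum(visits.values())
policy_target = {move: n / total_visits for move, n in visits.items()}
print(policy_target)
```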
iopq
Dies with sente
Posts: 113
Joined: Wed Feb 27, 2019 11:19 am
Rank: 1d
GD Posts: 0
Universal go server handle: iopq
Has thanked: 11 times
Been thanked: 27 times

Re: Confirmation bias in neural nets?

Post by iopq »

Leela Zero never seems to converge to the node with the higher win percentage: even after thousands of playouts, another move sits at a 0.3% higher win rate, and you'd think it would give it a chance. But it stupidly prefers to exploit the move it chose right away and just searches there a little bit more.

This didn't seem to occur in pure MCTS bots; they would eventually switch to the ever-so-slightly better option and exploit it instead.
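This matches how the two selection rules behave. A minimal sketch contrasting UCB1 (pure MCTS) with Leela Zero's PUCT, using invented priors and idealised child values (each child always returns its true win rate, no noise), so only the selection rule decides where visits go:

```python
import math

def run_search(total_playouts, select):
    """Idealised two-armed search: q values are fixed true win rates."""
    visits = [0, 0]
    q = [0.500, 0.503]  # the second move is the 0.3%-better one
    for n in range(1, total_playouts + 1):
        scores = [select(q[i], n, visits[i], i) for i in (0, 1)]
        visits[scores.index(max(scores))] += 1
    return visits

def ucb1(q, n, v, i, c=1.4):
    # Pure MCTS: the log bonus keeps growing, so every child is
    # re-tried forever and a 0.3% edge eventually wins the visits.
    return q + c * math.sqrt(math.log(n) / v) if v else float("inf")

priors = [0.99, 0.01]  # policy net strongly prefers the worse move
def puct(q, n, v, i, c=1.0):
    # AlphaZero/LZ-style PUCT: the bonus is scaled by the prior, so a
    # move the net dislikes keeps only a tiny exploration budget.
    return q + c * priors[i] * math.sqrt(n) / (1 + v)

ucb_visits = run_search(10_000, ucb1)
puct_visits = run_search(10_000, puct)
print(ucb_visits, puct_visits)
```

Under UCB1 the slightly better move ends up with the visit majority; under PUCT with a lopsided prior it gets only a trickle of visits. Since Leela Zero picks its final move by visit count, the 0.3%-better node can keep a higher win rate indefinitely without ever being chosen.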