Confirmation bias in neural nets?

For discussing go computing, software announcements, etc.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: Confirmation bias in neural nets?

Post by Uberdude »

Bill Spight wrote:Back when I was considering such things, it was plain that a good player needed to be able to refute bad moves, but there are some bad moves that a good opponent will never play, and so the ability to refute them may not be learned
There was a nice example of this I posted about a while ago, I think from a neural network bot shortly before AlphaGo came along. Said bot was doing well in a fight and had trapped some key cutting stones in a potential crane's nest tesuji (they had the 3 liberties and the opponent the 2 extensions on the side). The other bot then played the one-point jump to escape, which everyone stronger than 20 kyu knows is doomed to fail. The neural bot wedged, the other bot gave atari, and the neural bot connected instead of playing the squeeze; whoopsy, game over. A little MCTS reading would have saved the day, but presumably playing out the doomed crane's nest tesuji is so rare in the strong-player games used for training that the neural network hadn't learned how to refute it.

Also, with exploration vs exploitation there is a conflict in what is needed in different situations. We see blind spots where bots don't even consider moves 1 ply deep, which is due to insufficient exploration. But to read ladders you want high exploitation and low exploration, to quickly go deep down the 1 relevant variation.
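This tension shows up directly in the PUCT selection formula that AlphaGo-style engines use to pick which child to visit next. A minimal sketch (the priors and visit counts below are invented for illustration):

```python
import math

def puct_bonus(prior, parent_visits, child_visits, c_puct=1.0):
    """Exploration term of AlphaGo-style PUCT selection (sketch).

    The bonus is proportional to the policy prior, so a move the net
    assigns near-zero probability gets almost no exploration budget.
    """
    return c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

# A 1-ply blind spot: the refutation has prior 0.0005, so even after
# 10,000 parent visits its bonus for being unexplored is tiny...
blind = puct_bonus(prior=0.0005, parent_visits=10_000, child_visits=0)

# ...while the forced ladder move the policy loves keeps a big bonus,
# so the search happily dives down the single relevant variation.
ladder = puct_bonus(prior=0.95, parent_visits=10_000, child_visits=50)

print(f"{blind:.4f} vs {ladder:.4f}")
```

With one exploration constant shared across every position, raising it enough to catch the blind spot also dilutes the deep, narrow reading a ladder needs.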
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: Confirmation bias in neural nets?

Post by moha »

Uberdude wrote:But to read ladders you want high exploitation and low exploration to quickly go deep down the 1 relevant variation.
This assumes that the opponent (or its policy view) will also continue a losing ladder, which in turn would mean that the net puts high policy on ladder moves regardless of whether they work. That would be a failure of the net (it could do better), and it is also unlikely from a training point of view, since the net is trained towards the result of a search, and the search result will not be the ladder move if the ladder doesn't work (which is the case 50% of the time).
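The training-target point can be made concrete: in AlphaZero-style training the policy is pushed toward the search's visit distribution, not toward its own raw output. The move names and visit counts below are invented for illustration:

```python
# Sketch: the policy target is the normalised visit distribution of the
# search. If the search reads the ladder out and finds it fails, the
# ladder move collects few visits, so training pushes the policy away
# from it even if the raw net initially liked it.
visits = {"continue_ladder": 12, "give_up_ladder": 820, "tenuki": 168}
total_visits = sum(visits.values())
policy_target = {move: n / total_visits for move, n in visits.items()}
print(policy_target)
```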
iopq
Dies with sente
Posts: 113
Joined: Wed Feb 27, 2019 11:19 am
Rank: 1d
GD Posts: 0
Universal go server handle: iopq
Has thanked: 11 times
Been thanked: 27 times

Re: Confirmation bias in neural nets?

Post by iopq »

Leela Zero never seems to converge to the node with the higher win percentage: even after thousands of playouts, another move sits at a 0.3% higher win rate, and you'd think it would give it a chance. But it stupidly prefers to exploit the move it chose right away and just searches there a little bit more.

This didn't seem to occur in pure MCTS bots; they would eventually switch to the ever-so-slightly better option and exploit it instead.
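This matches how the two selection rules behave. A minimal sketch contrasting UCB1 (pure MCTS) with Leela Zero's PUCT, using invented priors and idealised child values (each child always returns its true win rate, no noise), so only the selection rule decides where visits go:

```python
import math

def run_search(total_playouts, select):
    """Idealised two-armed search: q values are fixed true win rates."""
    visits = [0, 0]
    q = [0.500, 0.503]  # the second move is the 0.3%-better one
    for n in range(1, total_playouts + 1):
        scores = [select(q[i], n, visits[i], i) for i in (0, 1)]
        visits[scores.index(max(scores))] += 1
    return visits

def ucb1(q, n, v, i, c=1.4):
    # Pure MCTS: the log bonus keeps growing, so every child is
    # re-tried forever and a 0.3% edge eventually wins the visits.
    return q + c * math.sqrt(math.log(n) / v) if v else float("inf")

priors = [0.99, 0.01]  # policy net strongly prefers the worse move
def puct(q, n, v, i, c=1.0):
    # AlphaZero/LZ-style PUCT: the bonus is scaled by the prior, so a
    # move the net dislikes keeps only a tiny exploration budget.
    return q + c * priors[i] * math.sqrt(n) / (1 + v)

ucb_visits = run_search(10_000, ucb1)
puct_visits = run_search(10_000, puct)
print(ucb_visits, puct_visits)
```

Under UCB1 the slightly better move ends up with the visit majority; under PUCT with a lopsided prior it gets only a trickle of visits. Since Leela Zero picks its final move by visit count, the 0.3%-better node can keep a higher win rate indefinitely without ever being chosen.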