Kirby wrote:
Uberdude wrote:
Plus, isn’t the success of AlphaGo, LeelaZero etc evidence this approach works?
It's different. These AIs are not calculating winrate from a single localized position. They calculate from an entire board position.
I'm not talking about how, when running AG/LZ, it can give you a winrate for a whole board position. I'm talking about the training process: how does information about which moves are good and which are bad get fed back into and update the neural network? The basic premise of AG/LZ training is:
- here's a game I played against myself
- pick a random move, say 67
- Did player of move 67 win this game?
- - Yes: 67 was probably better than the average goodness of the moves I play, so update the network to play more moves like it
- - No: 67 was probably worse than the average goodness of the moves I play, so update the network to play fewer moves like it
This only works if the assumption is valid that there was a causal link, however faint, between the goodness or badness of move 67 and whether or not the player of it won the game.
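The bullet points above can be sketched as a toy outcome-based update. This is only an illustration of the idea, not AG/LZ internals: the "network" is just a table of per-move preferences, and all the names (prefs, learning_rate, play_self_play_game) are made up for the sketch.

```python
import random

learning_rate = 0.1
prefs = {move: 0.0 for move in range(361)}  # one preference per board point

def play_self_play_game():
    # Stand-in for self-play: a random move sequence and a random winner.
    moves = [random.randrange(361) for _ in range(200)]
    winner = random.choice(["black", "white"])
    return moves, winner

moves, winner = play_self_play_game()

i = random.randrange(len(moves))            # pick a random move, say number 67
mover = "black" if i % 2 == 0 else "white"  # black plays moves 1, 3, 5, ...
move = moves[i]

if mover == winner:
    prefs[move] += learning_rate  # winner's move: nudge the network toward it
else:
    prefs[move] -= learning_rate  # loser's move: nudge the network away from it
```

The point is that the only training signal about move 67 is the final result of the whole game, which is exactly why the causal-link assumption matters.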
Kirby wrote:
If the local situation was good for black, but black died across the rest of the board, the winrate wouldn't be good.
Do you think black is more likely to have died somewhere else on the board when black descends at 'a' than when black pushes at 'b'? In the absence of any evidence of such a bias, my default position is to treat those as equally likely.
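That independence claim can be checked with a toy model: if dying elsewhere is equally likely after 'a' and after 'b', the winrate gap between them reflects only the local difference. The probabilities here are invented for the sketch, and the win condition (local success and no death elsewhere) is deliberately simplistic.

```python
import random

random.seed(0)
P_ELSEWHERE_DEATH = 0.3              # same after 'a' and 'b', by assumption
P_LOCAL_WIN = {"a": 0.6, "b": 0.4}   # hypothetical local values of the two moves

def winrate(move, trials=100_000):
    wins = 0
    for _ in range(trials):
        died_elsewhere = random.random() < P_ELSEWHERE_DEATH
        local_ok = random.random() < P_LOCAL_WIN[move]
        if local_ok and not died_elsewhere:
            wins += 1
    return wins / trials

# Expected gap: (0.6 - 0.4) * (1 - 0.3) = 0.14, so the elsewhere-death
# noise shrinks the signal but does not wash it out or bias it.
gap = winrate("a") - winrate("b")
```

With enough self-play games, that surviving 0.14 gap is exactly what lets the training process in the earlier post tell 'a' and 'b' apart.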
Bayesian Bill to the rescue?!