
Re: Why isn't playing a move you want a bot to analyze obvio

Posted: Thu Oct 10, 2019 3:40 am
by iopq
Bill Spight wrote:But there is no guarantee that, as bots get stronger, 7.5 komi will continue to favor White.
That might seem like a reasonable thing to say, but it isn't.

I've trained a 9x9 bot, and the advantage White gets from 7.5 komi only grows as the bot gets stronger. I can say fairly confidently that on 9x9 boards the correct komi is 7. We are much less certain about the correct komi on 19x19, but it's far more likely to be 7 than any other value (assuming Chinese counting). White's actual winrate in Leela Zero self-play keeps rising as White gets better at holding on to the point advantage. Similarly, with 5.5 komi (if that were the komi used in self-play), Black's winrate would grow instead.

KataGo has a win rate close to 50% using 7 komi.

There's a lot of evidence pointing to 7 komi on 19x19 and none suggesting otherwise. There's no guarantee the correct komi is 7, but that's like saying there's no guarantee the sun will shine tomorrow. Where is the evidence to the contrary?
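One way to see why self-play points at a particular value: under area (Chinese) scoring, the fair komi is the one that best balances Black wins against White wins over the distribution of final score margins. Here is a toy Python sketch of that idea; the function name and the sample margins are made up for illustration (real estimates come from millions of self-play games), so treat it as a sketch of the reasoning, not anyone's actual method:

```python
def balanced_komi(black_margins):
    """Given Black's final score margins (area scoring, before komi),
    find the integer komi k that makes the game closest to fair:
    P(margin > k) roughly equal to P(margin < k); games landing
    exactly on k would be ties."""
    best_k, best_gap = None, float("inf")
    for k in range(min(black_margins), max(black_margins) + 1):
        black_wins = sum(1 for m in black_margins if m > k)
        white_wins = sum(1 for m in black_margins if m < k)
        gap = abs(black_wins - white_wins)
        if gap < best_gap:
            best_k, best_gap = k, gap
    return best_k

# Hypothetical 9x9 margins clustered around +7 for Black:
margins = [7] * 80 + [9] * 10 + [5] * 10
print(balanced_komi(margins))  # → 7
```

With margins concentrated on 7, any komi other than 7 hands one side a systematic edge, which is exactly the pattern a strengthening bot makes sharper rather than weaker.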

Re: Why isn't playing a move you want a bot to analyze obvio

Posted: Thu Oct 10, 2019 4:43 am
by xela
EricBackus wrote: Does it really help to analyze every legal top-level move but not every legal response to those moves? You clearly don't want to force analysis of every response to every move at every level of the tree, because then you're just doing a brute-force search and you don't have anywhere close to the computing power you would need for that. But with only 10 playouts for each legal move, do you really get any kind of accurate estimate of the value of those moves?
My reasoning is that you don't need an accurate estimate, as the search works off error bounds rather than exact winrates. Consider:
  • Move A: the one that the bot would pick as the best move with a winrate of 52%
  • Move B: the "pro move" that the bot ignored, which if it were analysed would get a winrate of 51.5%
  • Move C: pretty hopeless
On ten visits for every move, you might find:
  • Move A: winrate looks like it's somewhere between 30% and 70%
  • Move B: winrate looks like it's between 28% and 71%
  • Move C: winrate looks like it's between 2% and 40%
Then moves A and B will both get a few extra playouts to refine the winrate estimate, but C won't. You end up being able to see an estimate for move B without the trouble of explicitly asking for it.
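The bound-based selection described above is the standard upper-confidence-bound idea from MCTS. A simplified Python sketch (LZ's real selection is C++ and also weights moves by the network's policy prior; the names and numbers below are illustrative only):

```python
import math

def ucb_select(moves, total_visits, c=1.0):
    """Pick the move with the highest upper confidence bound.
    moves: dict of name -> (value_sum, visits).
    Unvisited moves get an infinite bound, so they are tried first."""
    def upper_bound(value_sum, visits):
        if visits == 0:
            return float("inf")
        mean = value_sum / visits
        # Half-width shrinks as a move accumulates visits.
        half_width = c * math.sqrt(math.log(total_visits) / visits)
        return mean + half_width
    return max(moves, key=lambda m: upper_bound(*moves[m]))

# After 10 visits each, A and B have overlapping intervals, C does not:
moves = {"A": (5.2, 10), "B": (5.15, 10), "C": (2.0, 10)}
print(ucb_select(moves, 30))  # → A
```

Because A's and B's intervals overlap, B's upper bound stays competitive and it keeps receiving visits alongside A, while C's interval sits clearly below and it is starved; that is why a small forced minimum per move is enough to surface an estimate for B.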
EricBackus wrote:But, of course, it is worth trying anyway just to see what happens.
Done (for LZ). I feel like the apocryphal plumber: the code changes only took a few minutes, but it was more than an hour's work to figure out that I needed to change the UCTNode::uct_select_child function. This is a dodgy first cut: the number 10 is hard-coded. If I get some more spare time, I'll add a command line option to change the number of playouts. Of course this is all open source, so no reason why someone else couldn't pick it up and run with it.
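For readers who don't want to dig through the C++: the change amounts to forcing every legal child up to a minimum visit count before ordinary selection resumes. This is a simplified Python sketch of that logic, not the actual code in UCTNode::uct_select_child (the UCB fallback and all names here are illustrative):

```python
import math

MIN_VISITS = 10  # hard-coded, mirroring the "dodgy first cut"

def select_child(children, total_visits, c=1.0):
    """children: dict of move -> (value_sum, visits).
    Any child below MIN_VISITS is selected first; once every child
    has its minimum, fall back to ordinary UCB selection."""
    for move, (_, visits) in children.items():
        if visits < MIN_VISITS:
            return move
    def upper_bound(value_sum, visits):
        return value_sum / visits + c * math.sqrt(math.log(total_visits) / visits)
    return max(children, key=lambda m: upper_bound(*children[m]))

# "B" is still under its minimum, so it gets the next playout:
print(select_child({"A": (5.0, 10), "B": (0.5, 3)}, 13))  # → B
```

Making MIN_VISITS a command-line option is then just a matter of threading a parameter through in place of the constant.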

Initial results look promising but not spectacular. Using LZ network 242, in the medium term (5,000-10,000 playouts) you see a wider range of moves, but then it converges back to what the unmodified LZ-242 would have done anyway. Using the "bubblesId" network packaged with Lizzie 0.7, there seems to be more variety. Over the weekend I'll play with some different positions and different networks, and post some examples here unless I get distracted by the next shiny thing.

Re: Why isn't playing a move you want a bot to analyze obvio

Posted: Thu Oct 10, 2019 6:18 am
by Bill Spight
xela wrote:My reasoning is that you don't need an accurate estimate, as the search works off error bounds rather than exact winrates. [...] Initial results look promising but not spectacular.
Analysis, particularly if the analyst focuses on only a few key moves instead of trying to analyze the whole game at once, can tolerate search settings that would be too slow for actually playing a game. I would be willing, in certain cases, to spend 36k playouts on the first pass (100 on each of 360 moves) to discover playable moves, even possibly a "best" move that the bot would otherwise have overlooked.

Re: Why isn't playing a move you want a bot to analyze obvio

Posted: Fri Oct 11, 2019 6:07 am
by xela
Bill Spight wrote:Analysis, particularly if the analyst focuses on only a few key moves instead of trying to analyze the whole game at once, can tolerate search settings that would be too slow for actually playing a game. I would be willing, in certain cases, to spend 36k playouts on the first pass (100 on each of 360 moves) to discover playable moves, even possibly a "best" move that the bot would otherwise have overlooked.
OK, I've modified my code so that you can use different numbers of playouts. And I've had a go at making a Windows binary so that other people can play with this more easily, although I'm not sure whether it will work, as I usually do this sort of thing on Linux. Download from here. During the next day or three I might start a new thread to post some screenshots of results so far.