Re: KataGo V1.3
Posted: Sun Mar 01, 2020 4:45 am
lightvector, could you please briefly explain how KataGo uses maxPlayouts and maxVisits?
(I think there is a bit of confusion here)
Life in 19x19. Go, Weiqi, Baduk... That's the life.
https://www.lifein19x19.com/
inbae wrote:
IMHO, benchmarks should be done in playout parity, not in visit parity.
...
Playout parity, on the other hand, is more appropriate for measuring strength of engines, since the number of playouts is proportional to time spent.

The reason you test with a fixed amount of search instead of a fixed amount of time is to make the test independent of external factors like hardware speed or code optimizations, and to focus on network strength. With fixed playouts you reintroduce some such further factors: rewarding the side with better tree reuse, and randomizing the amount of effective search for each position. Such a wider test can also be useful, but may not always be appropriate.
xela wrote:
I think a lot of people tend to use "visits" and "playouts" interchangeably.

I think so too, and I doubt "visits" would necessarily mean tree reuse and "playouts" ignoring reuse. But LZ started to use them like this, so this is often implied (IIRC 1 playout = 1 actually performed simulation, 1 visit = 1 simulation, whether from reuse or actually performed now).

xela wrote:
If there's a difference, my understanding is that "one playout" is one round of exploring from the root to a leaf node, and one playout adds one visit to every node along the way, so that one playout = multiple visits.

"Visits" usually refers to the visit count of the root node, so this is less relevant.
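To make that counting convention concrete, here is a minimal Python sketch (hypothetical code, not KataGo's or Leela Zero's actual implementation) of how the root visit count relates to playouts when part of the tree is carried over from the previous move:

```python
# Hypothetical sketch (not engine code): at the root, each new playout
# adds exactly one visit, but visits can start above zero when the
# subtree for the opponent's actual move was kept from the last search.

def search_one_turn(reused_visits, max_visits=None, max_playouts=None):
    """Return (final root visits, playouts performed) for one move's search."""
    assert max_visits is not None or max_playouts is not None
    visits = reused_visits   # visits carried over from the reused tree
    playouts = 0             # simulations actually performed this turn
    while True:
        if max_visits is not None and visits >= max_visits:
            break
        if max_playouts is not None and playouts >= max_playouts:
            break
        playouts += 1        # one new simulation from root to a leaf...
        visits += 1          # ...adds exactly one visit at the root
    return visits, playouts

# With 100 visits reused from the previous move:
print(search_one_turn(100, max_visits=400))    # (400, 300): reuse saves playouts
print(search_one_turn(100, max_playouts=400))  # (500, 400): reuse adds on top
```

With a fixed visit limit, reuse reduces the new work done; with a fixed playout limit, reuse raises the effective visit count instead, which is exactly the distinction debated below.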
xela wrote:
I think a lot of people tend to use "visits" and "playouts" interchangeably. (The Lizzie interface doesn't help, showing "playouts" and "visits/second" where both are measuring the same thing.)
If there's a difference, my understanding is that "one playout" is one round of exploring from the root to a leaf node, and one playout adds one visit to every node along the way, so that one playout = multiple visits. <snip/>

Xela, I think you've got this wrong. My understanding is that playouts and visits (at least as the terms are used in "bot is configured at x visits/playouts per move") both count the same thing (one more leaf node in the tree of explored variations), but playouts are a delta per move, whilst visits are the total across tree reuse from previous moves, so playouts <= visits. x playouts will increase visits by x, but visits can start at > 0 when playouts for that move is 0. Setting playouts = x means: for each move, add an extra x nodes to the tree and then play the best move. Setting visits = x means: keep adding nodes to the tree (which could be non-empty if the opponent played an expected move) until there are x, and then play the best move. A worked example with playouts=4:
Move 1: Bot is black to play on an empty board.
Code: Select all
playout 1: B q4
Variation tree (visits = 1):
   Empty board
  /
B q4

playout 2: B d4
Variation tree (visits = 2):
   Empty board
  /    |
B q4  B d4

playout 3: B q4 W d16 (i.e. add W d16 as move 2 below the existing B q4 node)
Variation tree (visits = 3):
   Empty board
  /    |
B q4  B d4
  |
W d16

playout 4: B q16
Variation tree (visits = 4):
   Empty board
  /    |    \
B q4  B d4  B q16
  |
W d16

As playouts was set to 4, the bot stops exploring the tree and picks the move with (probabilistic bias on) the best averaged value from the network (i.e. for B q4 it is an average of how good the B q4 position is and how good the B q4 W d16 position is); say it picks B d4.
Code: Select all
Initial position: B d4 W q16

playout 1: B d16
Variation tree (visits = 1):
   B d4 W q16
  /
B d16

playout 2: W q4 after B d16
Variation tree (visits = 2):
   B d4 W q16
  /
B d16
  |
W q4

playout 3: B d17
Variation tree (visits = 3):
   B d4 W q16
  /     |
B d16  B d17
  |
W q4

playout 4: B r17 after B d16 W q4
Variation tree (visits = 4):
   B d4 W q16
  /     |
B d16  B d17
  |
W q4
  |
B r17
Code: Select all
Initial tree before any playouts (visits = 1):
   B d4 W q16 B d16 W q4
  /
B r17

Playout 1: B o17
Variation tree (visits = 2): NB we have 2 visits after 1 playout
   B d4 W q16 B d16 W q4
  /     |
B r17  B o17

Playout 2: W r16 after B r17
Variation tree (visits = 3):
   B d4 W q16 B d16 W q4
  /     |
B r17  B o17
  |
W r16

Playout 3: B q17 after B r17 W r16
Variation tree (visits = 4):
   B d4 W q16 B d16 W q4
  /     |
B r17  B o17
  |
W r16
  |
B q17
Code: Select all
Playout 4: B r3
Variation tree (visits = 5):
   B d4 W q16 B d16 W q4
  /     |     \
B r17  B o17  B r3
  |
W r16
  |
B q17
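The first worked example can be replayed mechanically. Below is a short Python sketch (illustrative only; the move lists are hard-coded from the example above, and none of this is engine code) in which each playout creates one new leaf and adds one visit to every node it passes through:

```python
# Sketch replaying the first worked example (empty board, playouts = 4).
# One playout = one walk from the root that ends by creating a new node;
# every node on the walk, including the root, gains one visit.

class Node:
    def __init__(self):
        self.visits = 0
        self.children = {}

def playout(root, moves):
    """One playout: descend along `moves`, creating missing nodes."""
    node = root
    node.visits += 1
    for m in moves:
        node = node.children.setdefault(m, Node())
        node.visits += 1
    return root

root = Node()
playout(root, ["B q4"])           # playout 1
playout(root, ["B d4"])           # playout 2
playout(root, ["B q4", "W d16"])  # playout 3: passes through B q4 again
playout(root, ["B q16"])          # playout 4

print(root.visits)                   # 4: four playouts -> four root visits
print(root.children["B q4"].visits)  # 2: visited by playouts 1 and 3
```

Note that the B q4 node ends with 2 visits even though only one playout (playout 3) descended below it to create a new node: each playout adds a visit to every node on its path, so interior nodes accumulate visits from multiple playouts.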
xela wrote:
I think a lot of people tend to use "visits" and "playouts" interchangeably. (The Lizzie interface doesn't help, showing "playouts" and "visits/second" where both are measuring the same thing.)
If there's a difference, my understanding is that "one playout" is one round of exploring from the root to a leaf node, and one playout adds one visit to every node along the way, so that one playout = multiple visits.

Historically, i.e., a few years ago, in MCTS playouts were made, not from the root, but from an unexpanded node, in order to estimate its winrate.
Uberdude wrote:
Xela, I think you've got this wrong. My understanding is that playouts and visits both count the same thing (one more leaf node in the tree of explored variations), but playouts are a delta per move, whilst visits are the total across tree reuse. playouts <= visits. <snip/> A worked example with playouts=4: <snip/>

At the end, what about the node B q4? Does it have 2 visits, because it has been visited twice, but only 1 playout, since only 1 playout has been made from it?
Bill Spight wrote:
Historically, i.e., a few years ago, in MCTS playouts were made, not from the root, but from an unexpanded node, in order to estimate its winrate.

Those were often called rollouts (i.e. to the end) instead.
jann wrote:
The more things you test at the same time (ie. network strength plus policy sharpness / tree reuse intensity) the harder it is to measure those things independently (same as with hw and other external factors).

The policy sharpness, and therefore tree reuse as well, are strongly bound to the nature of the NN. I have no idea why you consider them external factors.
jann wrote:
Bill Spight wrote:
Historically, i.e., a few years ago, in MCTS playouts were made, not from the root, but from an unexpanded node, in order to estimate its winrate.
Those were often called rollouts (ie. to the end) instead.

I think some confuse playouts with rollouts, which are a somewhat different thing.
inbae wrote:
I wrote:
The more things you test at the same time (ie. network strength plus policy sharpness / tree reuse intensity) the harder to measure those things independently (same as with hw and other external factors).
The policy sharpness and therefore tree reuse as well are strongly bound to the nature of the NN. I have no idea why you consider them as external factors.

This is no problem if you are sure that all those factors will work exactly the same way for later use as for the test. For example, tree reuse may work quite differently for high-visit and low-visit scenarios (I'm not saying it necessarily will, but it's possible). Then test results that included tree reuse extent may become less relevant than narrower ones.
Limeztone wrote:
The effect of limiting the search space instead is not so clear to me.

Like above: focus on fewer things and make results more robust and portable. But both narrower and wider tests have advantages and disadvantages (if you can test directly on the target hw and conditions, it's best to do just that, without synthetic limits).
jann wrote:
For example, tree reuse may work quite differently for high-visit and low-visit scenarios.

How?
jann wrote:
Then test results that included tree reuse extent may become less relevant than narrower ones.

I'm not sure what you mean by "wide" and "narrow" here. And the search tree will be reused in fixed-visits tests as well, unless you somehow disable tree reuse explicitly.
jann wrote:
For example, if you clear the tree each move, fixed playout tests are heavily affected

As I understand visits vs playouts, if you clear the tree before every move, visits and playouts become the same.
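That equivalence is easy to check with a toy counter (a hypothetical sketch, not engine code, assuming the convention discussed in this thread that each new simulation adds exactly one root visit; the "half the tree is reused" figure is an arbitrary illustration):

```python
# Toy sketch: compare root visits vs playouts over several consecutive
# moves, with and without clearing the tree between moves.

def play_game(num_moves, playouts_per_move, clear_tree):
    visits = 0
    history = []
    for _ in range(num_moves):
        if clear_tree:
            visits = 0            # discard the whole tree each move
        else:
            visits = visits // 2  # assume half the old tree is reused (arbitrary)
        playouts = 0
        while playouts < playouts_per_move:
            playouts += 1         # one new simulation...
            visits += 1           # ...adds one root visit
        history.append((visits, playouts))
    return history

print(play_game(3, 800, clear_tree=True))   # visits == playouts every move
print(play_game(3, 800, clear_tree=False))  # visits pull ahead of playouts
```

With clearing, every move ends at (800, 800); with reuse, the same 800-playout budget yields higher and growing effective visit counts, which is why fixed-playout and fixed-visit tests diverge only when reuse is in play.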