It is currently Thu Apr 25, 2024 8:14 am

All times are UTC - 8 hours [ DST ]




Post new topic Reply to topic  [ 53 posts ]  Go to page Previous  1, 2, 3  Next
Author Message
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #21 Posted: Thu Jan 09, 2020 5:05 pm 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
Now we'll analyse the position two moves earlier (this was an accident on my part, but produced interesting results!)

Click Here To Show Diagram Code
[go]$$Bc Black can capture the O17 stones
$$ ---------------------------------------
$$ | . O . . . . . . b c . . . . . . . . . |
$$ | . X O . . . . a X O O O . . 1 O O O . |
$$ | . O X O O . . d . X X X O O . X O . O |
$$ | . . X X O . O . . , . O X X X X X O . |
$$ | . . . X X O . . X O . X O . . . X . . |
$$ | . . . X O . O . X X X . X . . . X . . |[/go]


Here, white can live by exchanging a-d (not immediately, but a few moves in to the variation). But this does some damage to the top left, so it's an unclear position: perhaps white does better to tenuki immediately after :b1: (actually it's not clear-cut even after a few thousand playouts).

How does LZ assess this? I'm using network 258 again.

You know how sometimes Lizzie comes out in a rash, all covered in red spots?

Attachment:
rash.jpg
rash.jpg [ 78.3 KiB | Viewed 7110 times ]


This is the result of the "first-play urgency" (fpu) term that I mentioned earlier. Here's the fine print for this position:

Attachment:
trace_two_moves_earlier.csv [51.71 KiB]
Downloaded 313 times



So what exactly is happening here? The network's first assessment of P18 is that the position is good for white, 76% winrate. So this is the fpu value when looking for white's reply: white O19 gets a first-guess value of 76%. But once O19 is actually put on the board, its winrate estimate drops to 58% (you'll see 42% in the CSV file, this is from black's perspective). Meanwhile, having explored one node, the fpu_eval drops to 58%, but this is still a pretty big number for fpu.

On playout number 3, white O19's value is recalculated to 68%, but H18, which was previously the second choice, now gets 58% for FPU plus 12& for exploration bonus (puct), putting it in the lead.

Now that we've explored two different nodes, FPU drops to 52%, and both H18 and O19 are looking better than that, so the next 20 playouts stay with those two white moves. But by playout 22, LZ has found good black replies to those moves, so their evaluations have dropped below the fpu threshold, and it's time to try a new move. On playout 22, J17 scores 57% with FPU plus puct bonus and becomes the top choice. This looks promising for four playouts, but then the value drops, and again nothing is looking better than the FPU value, so LZ tries yet another new move at playout 27. And so on.

What's happening is that the FPU sets a bar for what a reasonable move should look like. In most normal positions, the high-policy moves (the first ones to be explored) will easily clear that bar, so LZ will build a narrow and deep tree. But if the high-policy moves don't live up to their initial promise, then LZ will widen the search until it either finds a better alternative or has tried everything on the board.


Attachments:
trace_two_moves_earlier.sgf [21.08 KiB]
Downloaded 604 times
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #22 Posted: Thu Jan 09, 2020 9:19 pm 
Dies with sente

Posts: 113
Liked others: 11
Was liked: 27
Rank: 1d
Universal go server handle: iopq
Bill Spight wrote:
jlt wrote:
Bill Spight wrote:
The only bridge book I gave my sister, a social player, was one by the great bridge author, Victor Mollo, in which, for amateurs, he advocates seeing, not the methodical calculation of variations. As readers here may be aware, I advocate seeing for go. I believe that Kotov's method is great for training, but in actual play I think you have to allow your unconscious to work, as well.


If I understand correctly, during actual play you let sequences pop up to your mind without trying to read "all" variations. But in order to get an efficient neural net (=brain), you need to train it by studying go problems, pro games, etc. During that training process, you think that the methodical calculation of variations is "great", but do you consider it as necessary?


It is important not to dither. It is important to supplement the human tendency to depth first search and shore up its weaknesses. Kotov's method does that. As for the best way to train the human brain, quien sabe? I don't think that we know the best way to train neural networks.

I have touched on these questions over the years. See the SL page about the fudge factor, for instance. https://senseis.xmp.net/?GoProblemsTheFudgeFactor

I look forward to seeing what xela is finding out about LZ's searches. :)


In that page I saw the bulky five + hane shape so I immediately made the bulky five placement, then extend after block then bent four in the corner.

I didn't have to read the internal sequences other than the exact one you read because they are of no interest. Black doesn't have room to live even if he's solidly connected on the outside, no?

Maybe we need to train people by giving them 1 second to solve each problem until they can't intuit the solution anymore. Then show the solution and move on to the next problem.

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #23 Posted: Fri Jan 10, 2020 4:47 am 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
On the topic of not reading all the variation, but relying on intuition (or "seeing"):
jlt wrote:
Bill Spight wrote:
...I advocate seeing for go. I believe that Kotov's method is great for training, but in actual play I think you have to allow your unconscious to work, as well.

If I understand correctly, during actual play you let sequences pop up to your mind without trying to read "all" variations.

RobertJasiek wrote:
Such might be ok for strategic planning but for tactical reading it is a total failure.

In mathematics, I've seen "magic" emerge from pro-level intuition. It's not unusual for a mathematician to publish a theorem, and then for someone to notice a gap in the proof. This is analagous to saying you've seen the key move that solves the tsumego for black, but you forgot to analyse one of white's responses. There are two ways it can pan out. The gap turns out to be an error: the theorem actually is false and needs to be withdrawn. (All your stones get captured.) Or the gap turns out to be a "mere oversight": after some work (in some cases, several years and many pages of publications later), the theorem is verified. (There's the moment of panic when your opponent plays the weird move you didn't think of, but then you're able to find the right reply.)

When graduate students leave these sorts of gaps, both scenarios are likely, and it's happened that people have seen their thesis crumble just before the finish line. But in the case of world-class mathematicians, it's pretty much always the case that their initial intuition was right, and the theorem holds up to scrutiny. This happened dramatically with Wiles's proof of Fermat's last theorem, and also a few times during the classification of finite simple groups. And those are just recent examples. It's been working this way for a long time, as documented in Poincaré's famous book of more than a century ago.

In general, mathematical research progresses by intuitive leaps, which are filled in with calculations after the fact. And then the general public gets the false impression that mathematics is logical and systematic. The leaders in the field aren't the ones who are better at calculation (although they are certainly very very good at calculation), they're the ones who can "see" further than others.

So I'm quite prepared to believe that good go players, in a game situation, will frequently "just see" the right move without reading out all variations. (And if they're asked to explain their choice afterwards, their method of explaining will be to put down variations that weren't part of their conscious thought process during the game.)


This post by xela was liked by 2 people: Harleqin, Waylon
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #24 Posted: Fri Jan 10, 2020 4:50 am 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
Kirby wrote:
breadth first seems harder to do for humans. I like the idea of identifying some candidate plays, especially at the root of the tree. but after that, it's confusing....Thinking in terms of breadth is important, because you don't want to miss moves. but it's pretty darn hard to keep the entire game tree in your head before you can start eliminating options from search.

In chess there's something called the killer heuristic. Many chess players won't know it by name if they're not interested in computer chess, but will subconsciously apply it anyway. Roughly speaking, it says that a good move in one variation is likely to be a good move in other variations. So it allows you to prune the tree very effectively. Even for breadth first search, you can throw out many of the candidate moves instantly. I suspect this heuristic is much more effective in chess than in go, making breadth-first search relatively easier in chess.

(Caveat: I'm an awful chess player, so don't trust me on this one.)

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #25 Posted: Fri Jan 10, 2020 4:52 am 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
Slight digression: I suspect that Bill (and maybe others) would be interested in the long-standing debate between "symbolic" and "connectionist" approaches to AI. I'm thinking systematic reading=symbolic and "seeing"=connectionist. See https://neurovenge.antonomase.fr/


This post by xela was liked by: Bill Spight
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #26 Posted: Fri Jan 10, 2020 5:33 am 
Judan

Posts: 6160
Liked others: 0
Was liked: 789
xela wrote:
good go players, in a game situation, will frequently "just see" the right move without reading out all variations.


If first we guess, then we also verify or refute.

(If mathematicians first state a conjecture, then they verify or refute it.)

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #27 Posted: Fri Jan 10, 2020 5:34 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
xela wrote:
Kirby wrote:
breadth first seems harder to do for humans. I like the idea of identifying some candidate plays, especially at the root of the tree. but after that, it's confusing....Thinking in terms of breadth is important, because you don't want to miss moves. but it's pretty darn hard to keep the entire game tree in your head before you can start eliminating options from search.

In chess there's something called the killer heuristic. Many chess players won't know it by name if they're not interested in computer chess, but will subconsciously apply it anyway. Roughly speaking, it says that a good move in one variation is likely to be a good move in other variations. So it allows you to prune the tree very effectively. Even for breadth first search, you can throw out many of the candidate moves instantly. I suspect this heuristic is much more effective in chess than in go, making breadth-first search relatively easier in chess.

(Caveat: I'm an awful chess player, so don't trust me on this one.)


Since go is a long game with a number of battlefields, it is unusual for a single play to be a killer. Maybe that is so a few times in a game. However, it is true that a good play in one variation will often be a good play in other variations, as well. That is behind the All Moves as First heuristic for go programs in pre-MCTS days. And one reason for persistent winrate swings in go is that both players are ignoring a big point or region and playing somewhere else. For instance, in the opening there may be winrate swings when each player fails to play in the last open corner.

Edit: This is one reason that I recommend local deep reading in actual play, as you may find a good play that was not at first obvious. :)

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.


Last edited by Bill Spight on Fri Jan 10, 2020 9:22 am, edited 1 time in total.
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #28 Posted: Fri Jan 10, 2020 8:47 am 
Gosei
User avatar

Posts: 1754
Liked others: 177
Was liked: 492
I looked at the order in which LZ258 explores the tree. I will ignore tenuki moves for simplicity and translate into human-understandable terms and diagrams.

The initial position is the following:

Click Here To Show Diagram Code
[go]$$Bc Initial position
$$ ----------------------------+
$$ . . . . . . . . . . . . . . |
$$ O . . X O O O . . 1 O O O . |
$$ . . . X X X X O O . X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


LZ examines a first variation

Click Here To Show Diagram Code
[go]$$Bc Variation 1.1: not much damage
$$ ----------------------------+
$$ . . . . . . . . 2 6 . . . . |
$$ O . . X O O O 4 3 1 O O O . |
$$ . . . X X X X O O 5 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


Let's try :b3: at :b5:

Click Here To Show Diagram Code
[go]$$Bc Variation 1.2: not much damage either
$$ ----------------------------+
$$ . . . . . . . . 2 4 . . . . |
$$ O . . X O O O . . 1 O O O . |
$$ . . . X X X X O O 3 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


Let's try :b3: at :w4:

Click Here To Show Diagram Code
[go]$$Bc Variation 1.3.1. White is split but alive on both sides
$$ ----------------------------+
$$ . . . . . . . 6 2 3 . . . . |
$$ O . . X O O O . 4 1 O O O . |
$$ . . . X X X X O O 5 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


That's better than variations 1.1 and 1.2 so we can discard them and explore further the subtree starting at :w4: to see if we can do even better. The only other reasonable response is :b5: at N19 so we have to check if N19 is better than P17. Let's read a few moves deep :


Click Here To Show Diagram Code
[go]$$Bcm5 Variation 1.3.2.1.1. White is split but alive on both sides
$$ ----------------------------+
$$ . . . . 2 . 6 1 O X . . . . |
$$ O 5 . X O O O 4 O X O O O . |
$$ . . . X X X X O O 3 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


That's even better.
Let's try :b7: at :w10:.

Click Here To Show Diagram Code
[go]$$Bcm5 Variation 1.3.2.1.2.1. White is dead, game over.
$$ ----------------------------+
$$ . . . 4 2 . 3 1 O X . . . . |
$$ O . . X O O O 6 O X O O O . |
$$ . . . X X X X O O 5 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


(Of course White won't play :w10: there, LZ prefers to tenuki. Saving 5 stones in gote is not interesting right now.)

But perhaps we could try :b9: at H18?

Click Here To Show Diagram Code
[go]$$Bcm5 Variation 1.3.2.1.2.2.1 Very good for Black
$$ ----------------------------+
$$ . . 6 4 2 . 3 1 O X . . . . |
$$ O . 5 X O O O . O X O O O . |
$$ . . . X X X X O O 7 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


This doesn't prevent escaping but is very good for Black.

Click Here To Show Diagram Code
[go]$$Bcm5 Variation 1.3.2.1.2.2.2 Roughly the same
$$ ----------------------------+
$$ . 7 6 4 2 . 3 1 O X . . . . |
$$ O 8 5 X O O O . O X O O O . |
$$ . . . X X X X O O 9 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


Well, it seems that with :b7: we can split White and in addition capture a few stones. But we didn't check other possibilities for :w6:.

Click Here To Show Diagram Code
[go]$$Bcm5 Variation 1.3.2.2
$$ ----------------------------+
$$ . . . . . . . 1 O X . . . . |
$$ O . . X O O O 2 O X O O O . |
$$ . . . X X X X O O 3 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


This looks so bad for White, so let's come back to variation 1.3.2.1. We said that :b7: at M19 is good but can we do better, for instance with :b7: at G18?

Click Here To Show Diagram Code
[go]$$Bcm7 Variation 1.3.2.1.3.
$$ ----------------------------+
$$ . . . . O . . X O X . . . . |
$$ O 1 . X O O O . O X O O O . |
$$ . . . X X X X O O . X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


This looks better for White, so let's come back to variation 1.3.2. with :w6: at J19.

Click Here To Show Diagram Code
[go]$$Wcm6 Variation 1.3.2.3.
$$ ----------------------------+
$$ . . . 1 . . . X O X . . . . |
$$ O . . X O O O . O X O O O . |
$$ . . . X X X X O O . X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


Looks good for Black but we'll figure that out later. Let's continue first variation 1.3.2.1.3.


Click Here To Show Diagram Code
[go]$$Bcm7 Variation 1.3.2.1.3., continued
$$ ----------------------------+
$$ . . . . O . 2 X O X . . . . |
$$ O 1 . X O O O . O X O O O . |
$$ . . . X X X X O O . X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


Still looks good but not optimal for Black. Let's finish first variation 1.3.2.2.

Click Here To Show Diagram Code
[go]$$Bcm5 Variation 1.3.2.2, continued:
$$ ----------------------------+
$$ . . . . 4 . . 1 O X . . . . |
$$ O . . X O O O 2 O X O O O . |
$$ . . . X X X X O O 3 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


White is dead, so variation 1.3.2.2. can be discarded, White won't play :w6: there.

Let's come back to variation 1.3.2.1.3 and read it until the end.

Click Here To Show Diagram Code
[go]$$Bcm7 Variation 1.3.2.1.3., continued
$$ ----------------------------+
$$ . . . . O . 2 X O X . . . . |
$$ O 1 . X O O O 4 O X O O O . |
$$ . . . X X X X O O 3 X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


OK, we've seen that position already, it's good for Black but not the best variation, so we can forget that :b7:. But let's come back to variation 1.3.2.3.

Click Here To Show Diagram Code
[go]$$Wcm6 Variation 1.3.2.3., continued
$$ ----------------------------+
$$ . . . 1 . . . X O X . . . . |
$$ O . 2 X O O O . O X O O O . |
$$ . . . X X X X O O . X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


Very good for Black, White should tenuki and give up his stones for the moment.

OK, so now that we have done 98 playouts, why not try some silly move for :w2:, just to check if we overlooked something?

Click Here To Show Diagram Code
[go]$$Wcm2 Other choice for :w2:
$$ ----------------------------+
$$ . . . . . . . . 2 . . . . . |
$$ O . . X O O O . 1 X O O O . |
$$ . . . X X X X O O . X O . O |
$$ . O . . , . O X X X X X O . |
$$ O . . X O . X O . . . X . . |[/go]


Oh no, that's really silly, White is dead, forget it.

Conclusion: it seems that the exploration is mostly depth first, from left to right, but LZ sometimes comes back to an earlier branch and reads a few moves further from there.

Here is the tree. Black edges represent a sequence starting with Black, green edges a sequence starting with White.

A blue vertex is a very good position for Black, a black vertex is slightly good for black, and a red vertex is bad for Black.

Attachment:
tree.png
tree.png [ 21.45 KiB | Viewed 7030 times ]


This post by jlt was liked by 3 people: Bill Spight, Waylon, xela
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #29 Posted: Fri Jan 10, 2020 2:32 pm 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
Thanks jlt, nice pictures!

Bill Spight wrote:
Hmmm. I wonder about this position.

Click Here To Show Diagram Code
[go]$$Bc Black kills
$$ ---------------------------------------
$$ | . O . . . . . . . . . . . . . . . . . |
$$ | . X O . . O W B X O O O . . . O O O . |
$$ | . O X O O . . . X X X X O O . X O . O |
$$ | . . X X O . O . . , . O X X X X X O . |
$$ | . . . X X O . . X O . X O . . . X . . |
$$ | . . . X O . O . X X X . X . . . X . . |[/go]


In this position Black can kill. I know that the margin of victory does not matter, but surely the possibility of a clean kill will affect the winrate estimate, no? Does that then affect the search?


Yes, it does have an effect. I'll post the raw data and you can form your own impressions :-)

First, analysing with network number 157.

Attachment:
davies_tesuji_bill_variation-trace157.csv [92.91 KiB]
Downloaded 310 times




Attachments:
davies_tesuji_bill_variation-trace157.sgf [31.29 KiB]
Downloaded 514 times


Last edited by xela on Fri Jan 10, 2020 2:35 pm, edited 1 time in total.

This post by xela was liked by: Bill Spight
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #30 Posted: Fri Jan 10, 2020 2:34 pm 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
And network 258 (separate post because of "three attachments per post" limit, am I the only person who finds this really annoying?)



Attachments:
davies_tesuji_bill_variation-trace258.sgf [23.69 KiB]
Downloaded 533 times
davies_tesuji_bill_variation-trace258.csv [62.45 KiB]
Downloaded 316 times
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #31 Posted: Sat Jan 11, 2020 8:51 am 
Dies with sente

Posts: 113
Liked others: 11
Was liked: 27
Rank: 1d
Universal go server handle: iopq
Bill Spight wrote:
xela wrote:
Kirby wrote:
breadth first seems harder to do for humans. I like the idea of identifying some candidate plays, especially at the root of the tree. but after that, it's confusing....Thinking in terms of breadth is important, because you don't want to miss moves. but it's pretty darn hard to keep the entire game tree in your head before you can start eliminating options from search.

In chess there's something called the killer heuristic. Many chess players won't know it by name if they're not interested in computer chess, but will subconsciously apply it anyway. Roughly speaking, it says that a good move in one variation is likely to be a good move in other variations. So it allows you to prune the tree very effectively. Even for breadth first search, you can throw out many of the candidate moves instantly. I suspect this heuristic is much more effective in chess than in go, making breadth-first search relatively easier in chess.

(Caveat: I'm an awful chess player, so don't trust me on this one.)


Since go is a long game with a number of battlefields, it is unusual for a single play to be a killer.

Maybe in your games. When reviewing my games with LZ I see that a single play is a "killer move" every move for me AND my opponent where it literally swings the game and we don't take it for many moves. Like a cut that we both didn't see or something. It's good in every variation. We could stop playing locally at any point and just do it to start a bigger fight that has a higher temperature, so whatever we're doing is not important

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #32 Posted: Sat Jan 11, 2020 9:10 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
iopq wrote:
Bill Spight wrote:
xela wrote:
In chess there's something called the killer heuristic. Many chess players won't know it by name if they're not interested in computer chess, but will subconsciously apply it anyway. Roughly speaking, it says that a good move in one variation is likely to be a good move in other variations. So it allows you to prune the tree very effectively. Even for breadth first search, you can throw out many of the candidate moves instantly. I suspect this heuristic is much more effective in chess than in go, making breadth-first search relatively easier in chess.

(Caveat: I'm an awful chess player, so don't trust me on this one.)


Since go is a long game with a number of battlefields, it is unusual for a single play to be a killer.

Maybe in your games. When reviewing my games with LZ I see that a single play is a "killer move" every move for me AND my opponent where it literally swings the game and we don't take it for many moves. Like a cut that we both didn't see or something. It's good in every variation. We could stop playing locally at any point and just do it to start a bigger fight that has a higher temperature, so whatever we're doing is not important


If you had quoted my whole, short paragraph, you would have seen that I reported that phenomenon and the reason for it.

Bill Spight wrote:
Since go is a long game with a number of battlefields, it is unusual for a single play to be a killer. Maybe that is so a few times in a game. However, it is true that a good play in one variation will often be a good play in other variations, as well. That is behind the All Moves as First heuristic for go programs in pre-MCTS days. And one reason for persistent winrate swings in go is that both players are ignoring a big point or region and playing somewhere else. For instance, in the opening there may be winrate swings when each player fails to play in the last open corner.


What you are talking about is what I meant by the italicized sentence. :)

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #33 Posted: Wed Jan 15, 2020 5:14 am 
Oza
User avatar

Posts: 2401
Location: Tokyo, Japan
Liked others: 2340
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
In post #21 above xela wrote,
xela wrote:
Now we'll analyse the position two moves earlier (this was an accident on my part, but produced interesting results!)

Click Here To Show Diagram Code
[go]$$Bc Black can capture the O17 stones
$$ ---------------------------------------
$$ | . O . . . . . . b c . . . . . . . . . |
$$ | . X O . . . . a X O O O . . 1 O O O . |
$$ | . O X O O . . d . X X X O O . X O . O |
$$ | . . X X O . O . . , . O X X X X X O . |
$$ | . . . X X O . . X O . X O . . . X . . |
$$ | . . . X O . O . X X X . X . . . X . . |[/go]


Here, white can live by exchanging a-d (not immediately, but a few moves in to the variation). But this does some damage to the top left, so it's an unclear position: perhaps white does better to tenuki immediately after :b1: (actually it's not clear-cut even after a few thousand playouts).

How does LZ assess this? I'm using network 258 again.

...


I have some difficulty following what this whole thread is saying. The problem is that this (and most of the other recent discussions using bot results seem to assume that the bots produce "a" result that we can use as an assessment of the position under discussion. I do not find that to be the case if we run the analysis more than once.

What I did here as an illustration was to start Lizzie with the LZ258 net, open xela's 258 analysis sgf file, go to the position indicated above with pondering off, make the move at P18, and then quickly turn pondering on and off to get 340-400 playouts as indicated at the top of the screen in Lizzie. I did it manually since I did not know how to do it more precisely. Why 340-400? That seemed comparable to the screenshot that xela has in post #21 above.

Below (3 attachments) and in the following post (3 more) are screenshots of the results of repeating the procedure 6 times in a row, shutting down Lizzie between each run. The allocation of the search and the resulting evaluation is significantly different between the runs.
Run #1
Attachment:
Run 1 cropped 900.jpg
Run 1 cropped 900.jpg [ 184.31 KiB | Viewed 6934 times ]

Run #2
Attachment:
Run 2 cropped 900.jpg
Run 2 cropped 900.jpg [ 186.96 KiB | Viewed 6934 times ]

Run #3
Attachment:
Run 3 cropped 900.jpg
Run 3 cropped 900.jpg [ 183 KiB | Viewed 6934 times ]

_________________
Dave Sigaty
"Short-lived are both the praiser and the praised, and rememberer and the remembered..."
- Marcus Aurelius; Meditations, VIII 21


This post by ez4u was liked by: Bill Spight
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #34 Posted: Wed Jan 15, 2020 5:17 am 
Oza
User avatar

Posts: 2401
Location: Tokyo, Japan
Liked others: 2340
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
Here are my additional pictures.
Run #4
Attachment:
Run 4 cropped 900.jpg
Run 4 cropped 900.jpg [ 171.4 KiB | Viewed 6933 times ]

Run #5
Attachment:
Run 5 cropped 900.jpg
Run 5 cropped 900.jpg [ 185.31 KiB | Viewed 6933 times ]

Run #6
Attachment:
Run 6 cropped 900.jpg
Run 6 cropped 900.jpg [ 171.83 KiB | Viewed 6933 times ]

_________________
Dave Sigaty
"Short-lived are both the praiser and the praised, and rememberer and the remembered..."
- Marcus Aurelius; Meditations, VIII 21

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #35 Posted: Wed Jan 15, 2020 2:42 pm 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
ez4u wrote:
I have some difficulty following what this whole thread is saying. The problem is that this (and most of the other recent discussions using bot results seem to assume that the bots produce "a" result that we can use as an assessment of the position under discussion. I do not find that to be the case if we run the analysis more than once.

Thanks, this is an important point that's got swamped in the discussion. Two points really.

First, I started this thread because I'm interested in the process, not the results. Can we peer inside Leela Zero's brain and learn more about what's happening? How exactly does LZ choose which moves to explore? Why does it sometimes give up on one move, switch to another for a while, then switch back? How is it deciding when to switch and when to stay with its first choice? Why does it sometimes ignore things that humans find promising? And so on.

Everything is represented inside the software as numbers. Can a close look at the numbers help us learn what's happening? (Not to improve our go playing, just to understand more about this fascinating new technology.) So I've been posting CSV files and SGF files. The CSVs contain all the numbers: these are the fundamental things I want to study. (Well, to be really fundamental, what I'm studying is the LZ source code. The CSVs are a way to test and refine my understanding.) And then the SGFs are an aid to navigation, a way to quickly flip through the variations and decide which lines of the CSV to look at.

But other people around here seem to be more results-oriented than me. When I checked a couple of days ago, noone had opened up any of the CSVs. Literally not one download. (Recently, two of the files have been downloaded twice each.) That's OK, if people find the SGFs interesting in their own right then I'm happy to go with the flow. But you're right, it's a change of focus.

Secondly, why are you getting different results on each run? And should you be worried?

There's a couple of factors:
  • LZ's neural nets are not symmetric. If you flip or rotate the board, you'll get slightly different evaluations. (There are eight possible symmetries.) Each time LZ evaluates the neural net, it will randomly choose one of the symmetries. A good way to see this is to use the "show policy" button in Lizzie. When you restart, you might get different policy values. For clear-cut positions, they should always be in the same ballpark. (For example, if one move has policy over 50%, and all other moves are below 5%, the exact numbers might move around, but the high policy move should always be the same move.) The network evaluations of winrates will also change, but Lizzie won't show you those numbers.
  • If you run with two or more threads, then the threads race each other. If thread 1 happens to go a bit faster than thread 2 today, and then tomorrow thread 2 is the faster one, they'll explore and evaluate the nodes in a different order. Which one is fastest will vary according to other things happening within your computer's operating system.
In principle (and I haven't tested this thoroughly), results will be perfectly reproducible if you set a random number seed (using the --seed command line option for LZ) and use only one thread. You can do this within Lizzie's config, but to make things reproducible you'll also need to restart LZ (reload the engine within Lizzie) every time you start analysis, to make sure you're starting again with the chosen random number seed. If you want really fine control over what LZ is doing, you need to learn to drive it from the command line by typing in GTP commands. (Tip: the GTP command "heatmap" will print out the policy network for the whole board.)

For any Monte Carlo method, there's some mathematical theory around the idea of convergence. That is, the first few iterations will give varying results depending on the seed, but after enough (a few thousand?) playouts, you should end up with the same answer regardless of where you started. For "pure" MCTS in go (i.e. not using a neural net, just using random rollouts to the end of the game), I think the theory is fairly solid in this respect. When you factor in the network being asymmetric, I don't think anyone's done the maths rigorously, we're just relying on intuition and experience to say the results will be sensible.


This post by xela was liked by 3 people: Bill Spight, dfan, ez4u
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #36 Posted: Thu Jan 16, 2020 1:57 am 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
To give you an idea of how much difference the asymmetry makes, here are the policy values and network winrates for this position:
Click Here To Show Diagram Code
[go]$$Bc Black can capture the O17 stones
$$ ---------------------------------------
$$ | . O . . . . . . . . . . a b . . . . . |
$$ | . X O . . . . c X O O O d e 1 O O O . |
$$ | . O X O O . . . f X X X O O . X O . O |
$$ | . . X X O . O . . , . O X X X X X O . |
$$ | . . . X X O . . X O . X O . . . X . . |
$$ | . . . X O . O . X X X . X . . . X . . |[/go]

Only moves a-f get policy values above 1% in any symmetry. (Sorry about the big vertical space, I don't know why this forum does that with tables.) You can see that regardless of the random number seeds, LZ will be looking at b, c and f in some order. The winrates vary a little bit but it's reasonable to expect they'll end up in the same place after a thousand playouts' worth of adjustment.





























movesymmetry 1symmetry 2symmetry 3symmetry 4symmetry 5symmetry 6symmetry 7symmetry 8
N190.5% 0.3% 0.6% 0.5% 2.1% 0.5% 0.4% 0.3%
O1920.1% 53.2% 31.9% 20.4% 19.3% 24.3% 21.3% 23.1%
H1856.0% 32.7% 46.7% 51.5% 41.8% 48.5% 64.6% 55.0%
N181.8% 0.5% 0.7% 2.7% 2.1% 2.2% 0.3% 1.3%
O182.3% 1.3% 0.9% 0.8% 3.0% 1.8% 2.0% 1.8%
J1710.1% 2.4% 11.5% 15.2% 23.4% 14.8% 1.7% 9.8%
winrate70.1% 75.9% 79.1% 68.6% 80.3% 78.1% 68.3% 62.4%


I would guess that your run number 3 is symmetry 2 in my table, and run numbers 4 and 6 are symmetry 7. The others are a bit harder to pick. (And why does P19 sometimes appear in your screen shots despite its policy value being lower than the moves I've shown? Because for some random number seeds, J17 gets a winrate estimate in the 20s so it gets put aside quickly, whereas P19 gets a winrate estimate in the high 40s so it's a more promising candidate for investigation.)


This post by xela was liked by 2 people: dfan, ez4u
Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #37 Posted: Thu Jan 16, 2020 1:58 am 
Lives in gote

Posts: 586
Location: Adelaide, South Australia
Liked others: 208
Was liked: 265
Rank: Australian 2 dan
GD Posts: 200
Oh, and the "show policy" button in Lizzie seems to show different policy values from the ones that LZ will display on the command line using the "heatmap all" GTP command. I haven't got to the bottom of that one yet. The numbers are close, but not identical.

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #38 Posted: Thu Jan 16, 2020 3:59 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
I suppose that the first symmetry is a reflection in a line. But which line? The 10th file, the 10th rank, the diagonal from A-01 to T-19, or the diagonal from A-19 to T-01? :)

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #39 Posted: Thu Jan 16, 2020 4:37 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
xela wrote:
LZ's neural nets are not symmetric.


Why not? I can guess, but I'd rather not.

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

Top
 Profile  
 
Offline
 Post subject: Re: Exploring LZ's search algorithm: worked examples
Post #40 Posted: Thu Jan 16, 2020 6:27 am 
Gosei

Posts: 1590
Liked others: 886
Was liked: 528
Rank: AGA 3k Fox 3d
GD Posts: 61
KGS: dfan
Bill Spight wrote:
xela wrote:
LZ's neural nets are not symmetric.

Why not? I can guess, but I'd rather not.

How could they be? Can you define the mathematical constraint you'd like to put on the convolutional kernels?

One thing you could theoretically do to handle symmetries is to canonicalize every input position to the network. Create a total ordering on board positions (e.g., sort positions by the state of the A1 point, then the A2 point if the two positions are the same at A1, then the A3 point, etc.). Then when you want to evaluate a position, take the eight transformations of it (including the identity), select the one that results in the "smallest" position, apply that transformation, send it through the network, and then apply the inverse transformation to the policy. That sounds like kind of a waste of energy, though.

Similarly, you could define the output of the policy network to be the average of all eight policies on the transformed positions, but now you're effectively doing eight times as much work to evaluate each position.


This post by dfan was liked by: Bill Spight
Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 53 posts ]  Go to page Previous  1, 2, 3  Next

All times are UTC - 8 hours [ DST ]


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group