Focused search

Bill Spight · #1

Has any research been done using focused search with AI? I assume so.

Edit: I mean go AI, and I mean focused on part of the go board. Thanks to Kirby for asking.

Kirby · #2

What's the definition of focused search? Like a heuristic for searching the game tree? When AlphaGo utilized a policy network in MCTS, would this be considered a type of "focused search"?

I'm thinking you're referring to something specific that I don't understand the definition of.

Bill Spight · #3

Kirby wrote:

What's the definition of focused search? Like a heuristic for searching the game tree? When AlphaGo utilized a policy network in MCTS, would this be considered a type of "focused search"?

I'm thinking you're referring to something specific that I don't understand the definition of.

I mean search that is focused on part of the go board. For instance, for the first move only search in the Right-Top-Right eighth of the board. Or set up positions from the Segoe-Go Seigen tesuji problems, which are half boards, and search those half boards. Or problems in the Guanzipu may be fairly sprawling, but still in a limited area. Also, search for many difference games would benefit from being focused.

"Research" is probably too strong a term.

xela · #4

Yes, but I don't think this is what you're looking for: http://lie.math.brocku.ca/GoTools/

You mean based on AlphaGo/LZ/KataGo-like architecture? It's on my list of interesting things that I may or may not get around to during the next ten years. I'm hoping someone else leaps in first :-)

Lizzie can already tell LZ to avoid/allow certain points in search. But the interface is clumsy: you have to add intersections one at a time, with no way to select a region. And it only affects the top level of the search tree.
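To get around adding intersections one at a time, a region can be expanded into a list of coordinates by a small script. This is only a sketch: `region_vertices` is a made-up helper name, and how the resulting list is handed to the engine depends on the front end and is not shown here.

```python
# GTP column letters skip "I", so a 19x19 board uses A-T without I.
GTP_COLUMNS = "ABCDEFGHJKLMNOPQRST"


def region_vertices(col_min, col_max, row_min, row_max, size=19):
    """Return GTP vertex names (e.g. 'Q16') for a rectangle.

    Columns and rows are 1-based and inclusive; row 1 is the bottom edge,
    as in GTP coordinates.
    """
    if not (1 <= col_min <= col_max <= size and 1 <= row_min <= row_max <= size):
        raise ValueError("region must lie on the board")
    return [
        f"{GTP_COLUMNS[col - 1]}{row}"
        for row in range(row_min, row_max + 1)
        for col in range(col_min, col_max + 1)
    ]


# The 7x7 top-left corner: columns A-G, rows 13-19.
top_left = region_vertices(1, 7, 13, 19)
print(len(top_left), ",".join(top_left[:7]))
```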

I guess you're looking at a situation where, say, the top left corner contains a tsumego or joseki position, the rest of the board is blank, and the AI really wants to tenuki. The challenge is that if you restrict the search to the top left, you're likely to find that all the policy network values in that region are 0.1% or less. Once the policy has learned to identify the area as uninteresting (compared with the rest of the board), it doesn't train to discriminate between moves in that area. So, on current networks, if you restrict the search to that area, it will be doing something close to brute-force search. And we know how well that works in go.
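A toy illustration of why that ends up close to brute force, assuming the restriction is implemented by zeroing the policy outside the region and renormalising what is left. The numbers are made up for the example, not taken from any real network, and `restrict_policy` is a hypothetical helper.

```python
def restrict_policy(policy, allowed):
    """Keep only the moves in `allowed` and renormalise their priors to sum to 1."""
    kept = {move: p for move, p in policy.items() if move in allowed}
    total = sum(kept.values())
    if total <= 0:
        # No prior mass at all inside the region: fall back to a uniform prior.
        return {move: 1.0 / len(allowed) for move in allowed}
    return {move: p / total for move, p in kept.items()}


# Made-up whole-board policy: nearly everything goes to a tenuki at Q4,
# and each local move gets roughly 0.1%.
policy = {"Q4": 0.996, "A18": 0.0010, "B18": 0.0009, "C18": 0.0011, "B19": 0.0010}
local = restrict_policy(policy, {"A18", "B18", "C18", "B19"})
for move, prior in sorted(local.items()):
    print(move, round(prior, 3))
```

After renormalising, the four local moves come out nearly equal, so the prior gives the search almost no guidance about which one to read first.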

I can think of four possible approaches.

1. Software to generate a plausible pattern of stones to fill in the rest of the board with settled positions, so that the region of interest is indeed the most interesting.
2. Train a new network for part-board positions. This is tricky because you don't have the game result to flag good/bad outcomes. I think it would take a lot of work to identify the right objective function (i.e. if you're not training the network to win a game, then what goal are you training it for?)
3. Think of different ways to interrogate existing networks. I'm thinking that the network does have something like a "sense of local shape" buried in the middle layers, but this gets suppressed by later layers for regions that aren't interesting in the whole-board context. So instead of just looking at the output layer of the network, is there a way to identify which bits of the whole network "light up" when it looks at a position?
4. Mess with the network inputs. In theory, it can accept floating point numbers; in practice, it gets 0.0 for empty space and 1.0 for "there's a stone here". Instead of trying to arrange stones in plausible patterns, what happens if you just feed it 0.01 of a black stone and 0.01 of a white stone for every intersection outside your region of interest? Would it just crash, or would something interesting come out?
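The last idea (fractional stones outside the region) might be set up roughly like this. It is a sketch under strong assumptions: it builds only two planes, one per colour, with 1.0 for a stone and 0.0 for empty, whereas real networks take many more input planes (history, liberties and so on), and nothing guarantees that a network trained on 0/1 inputs does anything sensible with 0.01. `build_planes` is a hypothetical name.

```python
def build_planes(stones, region, size=19, outside_value=0.01):
    """Build (black_plane, white_plane) as size x size lists of floats.

    stones: dict mapping (row, col) to 'B' or 'W'.
    region: set of (row, col) points that keep their exact 0.0/1.0 encoding.
    Every point outside `region` gets `outside_value` in both planes,
    regardless of any stone placed there.
    """
    black = [[0.0] * size for _ in range(size)]
    white = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            if (row, col) in region:
                colour = stones.get((row, col))
                black[row][col] = 1.0 if colour == "B" else 0.0
                white[row][col] = 1.0 if colour == "W" else 0.0
            else:
                black[row][col] = outside_value
                white[row][col] = outside_value
    return black, white


# Region of interest: a 5x5 corner (rows 0-4, columns 0-4).
corner = {(r, c) for r in range(5) for c in range(5)}
black, white = build_planes({(0, 0): "B", (1, 1): "W"}, corner)
print(black[0][0], white[1][1], black[10][10], white[10][10])
```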

I have spent a little time exploring tsumego with KataGo. Sometimes, all you need is to put a stone on each 3-3 point outside the interesting corner, and then it will go ahead and look at the tsumego. Other times, I might need 50+ moves to create settled territory everywhere else on the board before it stops trying to tenuki away from the tsumego position. That in itself is a good lesson: it's surprising how often sacrificing the stones is preferred ahead of trying to save them.

The Go Seigen/Segoe positions are different in nature from the average tsumego: with half the board occupied, KataGo will often go straight to the solution even when the other two corners are empty. (But inputting those positions is a bit tedious, hence my question on image recognition software.)

Bill Spight · #5

xela wrote:

Yes, but I don't think this is what you're looking for: http://lie.math.brocku.ca/GoTools/

I have admired Thomas's work for a long time.

He and I have been buddies since '08.

Quote:

You mean based on AlphaGo/LZ/KataGo-like architecture? It's on my list of interesting things that I may or may not get around to during the next ten years. I'm hoping someone else leaps in first :-)

Lizzie can already tell LZ to avoid/allow certain points in search. But the interface is clumsy: you have to add intersections one at a time, with no way to select a region. And it only affects the top level of the search tree.

Clumsiness aside, that would work for restricting the first move to an "eighth" of the board (actually 55 points).
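The count of 55 can be checked by enumerating one triangular eighth, measured as offsets from tengen and including both the diagonal and the centre line. The function name is made up for the sketch; which of the eight triangles is used makes no difference to the count.

```python
def first_move_eighth(size=19):
    """Return (dx, dy) offsets from the centre point with 0 <= dy <= dx.

    dx is the offset towards one edge and dy the offset towards an adjacent
    edge, so the points form one triangular eighth of the board, with the
    centre line (dy == 0) and the diagonal (dy == dx) included.
    """
    half = size // 2
    return [(dx, dy) for dx in range(half + 1) for dy in range(dx + 1)]


print(len(first_move_eighth()))  # 10 + 9 + ... + 1 = 55 on a 19x19 board
```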

A while back I asked about making one layer of the neural network symmetrical, and it seems like that is a bad idea. Since then it hit me that making the play asymmetrical instead, which you can do by restricting the first play, might be a good idea. Then we would probably want the trained neural network to be asymmetrical, eh?

Edit: Hmmm. If we want to teach the network to play in that part of the board for its first play, we should probably allow it to play elsewhere, but assign a loss if it does.

And, OC, in a real game if the bot is White and Black plays the first move elsewhere, we map the plays so that White thinks that Black made the first move in the assigned area.
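That remapping could look something like the sketch below: pick whichever of the eight board symmetries carries Black's first move into the assigned eighth, feed the bot the transformed position, and apply the inverse symmetry to the bot's replies. Coordinates are 0-based (x from the left, y from the bottom), the assigned eighth is taken to be the right-top-right triangle, and all names are hypothetical.

```python
def apply_symmetry(k, point, size=19):
    """Apply symmetry number k (0-7) of the square board to a point (x, y)."""
    x, y = point
    n = size - 1
    return [
        (x, y), (n - x, y), (x, n - y), (n - x, n - y),
        (y, x), (n - y, x), (y, n - x), (n - y, n - x),
    ][k]


# INVERSE[k] undoes symmetry k: the two quarter-turn rotations (5 and 6)
# undo each other, every other symmetry is its own inverse.
INVERSE = [0, 1, 2, 3, 4, 6, 5, 7]


def in_assigned_eighth(point, size=19):
    """True if the point lies in the right-top-right triangle (edges included)."""
    x, y = point
    return size // 2 <= y <= x


def canonical_symmetry(point, size=19):
    """Return the first symmetry that carries `point` into the assigned eighth."""
    for k in range(8):
        if in_assigned_eighth(apply_symmetry(k, point, size), size):
            return k
    raise AssertionError("every point has an image in the assigned eighth")


# Black opens in the bottom-left corner at (3, 3).
k = canonical_symmetry((3, 3))
seen_by_bot = apply_symmetry(k, (3, 3))
print(k, seen_by_bot, apply_symmetry(INVERSE[k], seen_by_bot))
```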

Quote:

I guess you're looking at a situation where, say, the top left corner contains a tsumego or joseki position, the rest of the board is blank, and the AI really wants to tenuki. The challenge is that if you restrict the search to the top left, you're likely to find that all the policy network values in that region are 0.1% or less. Once the policy has learned to identify the area as uninteresting (compared with the rest of the board), it doesn't train to discriminate between moves in that area. So, on current networks, if you restrict the search to that area, it will be doing something close to brute-force search. And we know how well that works in go.

Does the policy or value network change during a single game? (A copy of the original would be used, OC.) If not, how is that different from what we have now? Anyway, my impression from looking at the Elf commentaries is that Elf likes corners, in particular 3-3 points, although not enough to play there first.

Quote:

I can think of four possible approaches.
{snip}
Train a new network for part-board positions. This is tricky because you don't have the game result to flag good/bad outcomes. I think it would take a lot of work to identify the right objective function (i.e. if you're not training the network to win a game, then what goal are you training it for?)

The right objective function is no problem for difference games. You are asking the first player to win the zero komi game, the second player to win or tie. (In effect giving the second player ½ pt. komi.) My guess is that the second player would tend to mirror the first player's moves, especially in irrelevant areas of the board. But who knows? If the human can flag the irrelevant areas, that would almost surely reduce the search.
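Spelled out as a training label, that objective is tiny. A minimal sketch, assuming the final score is available as the first player's margin at zero komi; the function name is made up.

```python
def difference_game_targets(first_player_margin):
    """Return (first_player_target, second_player_target) for a difference game.

    first_player_margin is the first player's score minus the second
    player's score at zero komi. The first player must win outright;
    a tie counts as a success for the second player.
    """
    first_wins = first_player_margin > 0
    return (1.0, 0.0) if first_wins else (0.0, 1.0)


for margin in (3, 0, -2):
    print(margin, difference_game_targets(margin))
```

For whole-number margins the test `margin > 0` gives the same answer as `margin - 0.5 > 0`, which is the "½ pt. komi for the second player" reading.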

Quote:

I have spent a little time exploring tsumego with KataGo. Sometimes, all you need is to put a stone on each 3-3 point outside the interesting corner, and then it will go ahead and look at the tsumego. Other times, I might need 50+ moves to create settled territory everywhere else on the board before it stops trying to tenuki away from the tsumego position. That in itself is a good lesson: it's surprising how often sacrificing the stones is preferred ahead of trying to save them.

Yes. As the guy who proposed the proverb, Tenuki is always an option, I feel that the bots have supported it.

But I still have some doubts. Sometimes the tenuki implies that the whole board temperature is greater than 15, with no obvious large scale fight. OC, that may well be the case, as your research on temperature suggests. But it may also be the case of a shared blind spot, resulting from self play training. Quien sabe?

Quote:

The Go Seigen/Segoe positions are different in nature from the average tsumego: with half the board occupied, KataGo will often go straight to the solution even when the other two corners are empty.

That's good to hear.

As for life and death positions, one thing I have tried with Deep Leela, without much success, is to set up a similar mirrored position in the diagonally opposite corner, which is settled. E.g., if Black is trying to kill White in the top right corner, White has killed Black in the bottom left corner. You need to set komi, though, so that Black wins by killing.
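The geometric part of that setup (rotate the corner by 180 degrees and swap the colours) is easy to script. The sketch below only copies the stones it is given, so the settled version of the corner, with the kill already played out, still has to be supplied by hand, and komi is not handled. Names are hypothetical; coordinates are 0-based (x, y).

```python
def add_mirrored_corner(stones, size=19):
    """Return a new position with a rotated, colour-swapped copy of `stones` added.

    stones: dict mapping (x, y) to 'B' or 'W'. Each stone is copied to the
    point diagonally opposite through the centre, with its colour swapped.
    """
    n = size - 1
    swap = {"B": "W", "W": "B"}
    result = dict(stones)
    for (x, y), colour in stones.items():
        mirrored = (n - x, n - y)
        if mirrored in stones:
            raise ValueError(f"mirror image of {(x, y)} lands on an existing stone")
        result[mirrored] = swap[colour]
    return result


# Two stones in the top right get colour-swapped partners in the bottom left.
position = add_mirrored_corner({(16, 16): "W", (17, 15): "B"})
print(sorted(position.items()))
```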

xela · #6

Bill Spight wrote:

Does the policy or value network change during a single game?

The network itself, i.e. the file of a gazillion numbers representing the network weights, doesn't change. But the policy value that the network assigns to, say, a play at C15, can change dramatically (*) according to the position of stones in other areas of the board, even when the region of interest hasn't changed.

(*) e.g. from 0.1% to 99%.
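One mechanism behind a swing that large is that the policy is a softmax over every move on the board, so a move's probability depends on how it compares with everything else. A toy illustration with made-up logits (in a real network the raw score for C15 would itself also depend on the whole position):

```python
import math


def softmax(logits):
    """Turn a dict of raw scores into probabilities that sum to 1."""
    top = max(logits.values())
    exps = {move: math.exp(score - top) for move, score in logits.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}


# Same raw score for C15 in both cases; only the rest of the board differs.
quiet_elsewhere = softmax({"C15": 2.0, "Q4": -3.0, "Q16": -3.0})
urgent_elsewhere = softmax({"C15": 2.0, "Q4": 9.0, "Q16": 3.0})
print(f"{quiet_elsewhere['C15']:.2%} vs {urgent_elsewhere['C15']:.2%}")
```

With nothing urgent elsewhere, C15 takes nearly all of the probability; once a much higher-scoring move appears at Q4, the same raw score for C15 is worth less than 0.1%.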

Bill Spight · #7

xela wrote:

Bill Spight wrote:

Does the policy or value network change during a single game?

The network itself, i.e. the file of a gazillion numbers representing the network weights, doesn't change. But the policy value that the network assigns to, say, a play at C15, can change dramatically (*) according to the position of stones in other areas of the board, even when the region of interest hasn't changed.

(*) e.g. from 0.1% to 99%.

As I thought.
