reviewing SL articles using LZ and criticism

John Fairbairn · Post by **John Fairbairn** » Wed Dec 04, 2019 6:03 am

Andrew: I have upgraded LZ since you kindly gave advice earlier on, but I admit to not bothering too much about number of playouts. I'm working on the principle that policy moves are more likely to be based on principles, and principles are what - in principle! - we are trying to find. If a bot spends ages and comes up with a move that tactically overrides an earlier principle, that just seems to be the exception that proves the rule. We are all used to that sort of behaviour with go proverbs. I'm quite prepared to be shown I've got my head on backwards, of course.

As to acknowledging that bots make mistakes, I'm well aware of that for the simple reason that I have won quite a few games against them (ladders/tsumego blips), but again I haven't the patience or the machine to go for very high playouts.

lightvector · Post by **lightvector** » Wed Dec 04, 2019 6:59 am

John Fairbairn wrote: Andrew: I have upgraded LZ since you kindly gave advice earlier on, but I admit to not bothering too much about number of playouts. I'm working on the principle that policy moves are more likely to be based on principles, and principles are what - in principle! - we are trying to find. If a bot spends ages and comes up with a move that tactically overrides an earlier principle, that just seems to be the exception that proves the rule.

Based on my experience working with bots and MCTS, I think using this as a justification to put credence on low-playout search results is dangerous and you simply should not be doing so. It's precisely the other way around - more playouts will usually clarify any "principles" to be found, and fewer playouts will usually obscure them.

Bots learn things in a different way than humans do (as should be at least partly suggested by their very different style). And one result of this is that the bot's raw policy, while being superhuman in some intuitive aspects, is often uncertain and mistaken in ways that professional players would hardly ever be. Search is necessary to correct these flaws, and what you are seeing is exactly that.

So often, it's less that a bot spends ages and ages to come up with some clever tactic, and more a bot spends ages and ages and realizes its policy is being really stupid in several ways at all different points down this or that variation and affecting its judgment in ways that it really shouldn't.

Additionally, despite bot holistic judgment being "superhumanly" strong on average, the individual raw value net evaluations are often fantastically noisy. The superhuman part emerges in the average, not in any single evaluation.

So often, it's less that a bot spends ages and ages to come up with some clever tactic, and more a bot spends ages and ages to accumulate enough playouts so that the noise averages out and you are actually left with a reasonable guide to the good moves.
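The averaging effect described above can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the three "true" winrates, the noise level, and the function names `pick_best` and `accuracy` are all made up for illustration, and real MCTS does not simply average independent noisy samples. It only shows how often the truly best move comes out on top as the number of evaluations per move grows.

```python
import random

def pick_best(true_values, noise_sd, n_evals, rng):
    """Average n_evals noisy evaluations per move; return the index of the best average."""
    means = []
    for v in true_values:
        samples = [v + rng.gauss(0.0, noise_sd) for _ in range(n_evals)]
        means.append(sum(samples) / n_evals)
    return max(range(len(means)), key=means.__getitem__)

def accuracy(true_values, noise_sd, n_evals, trials=300, seed=0):
    """Fraction of trials in which the truly best move also has the best average."""
    rng = random.Random(seed)
    best = max(range(len(true_values)), key=true_values.__getitem__)
    hits = sum(pick_best(true_values, noise_sd, n_evals, rng) == best
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    # Hypothetical winrates for three candidate moves (assumed, not engine output).
    values = [0.55, 0.53, 0.50]
    for n in (1, 10, 100, 1000):
        print(n, accuracy(values, noise_sd=0.10, n_evals=n))
```

With a single noisy evaluation per move the ranking is close to a coin toss between similar moves; with many evaluations the small true differences become visible.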

Use more playouts.

lightvector · Post by **lightvector** » Wed Dec 04, 2019 7:13 am

Addendum: You can somewhat reduce the number of playouts needed by using a bot *interactively* though, to explore variations dynamically and see what it says rather than sitting and waiting for long enough for the bot to correct its own flaws (if you're on weaker hardware).

Uberdude's post above is excellent and is essentially what I would do. If you haven't already internalized the way of using the bot that he walks through in that post, giving it a second read and trying to apply the same approach in your own usage could be worthwhile.

Knotwilg · Post by **Knotwilg** » Wed Dec 04, 2019 7:23 am

Let me give another example of how I use bots for learning.

When I started playing, the "Catenaccio joseki" was very popular, as a response against the equally popular high pincer. Later we were told that it wasn't good for the player who had two stones on the second line, and the pattern went out of fashion.

I tested it in LZ/KataGo, early enough and in an opening where the play would make sense from both sides. First of all, we stay well below the 10% threshold, of course, even below 5%, or below 1.5 points as per KataGo. Secondly, it's not the moves which directly result in low stones that are considered "bad". I know this is not what pros/teachers said at the time but I could have thought that. The "bad moves" are the pincer and the jump:

$$W Catenaccio joseki
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . X . . . . |
$$ | . . . O . . . . . , . . . . . , X . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . 2 . 7 . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . 3 . 1 . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . 4 . X . 5 . |
$$ | . . . . . . . . . . . . a b . . 6 . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------

When W1 approaches with the idea of breaking up Black's side framework, KataGo (like other bots) puts White at 55%, slightly more than 1 point ahead. The preferred answer for Black is to back off at 'a' or 'b'.

Black's pincer B2 is slightly inferior. White's chances grow to 57% and its lead grows by half a point. Next, White should invade the corner at B6. However, the jump of W3 is the big thing in this pattern evaluation. It loses more than a point and drops White back to 54%.

The remaining moves don't change much about how KataGo evaluates this pattern. It still likes the 3-3 invasion better than the slide of W5, but only marginally so. And it hardly objects to White's slide at W7, even if it's on the second line.

After the pattern, KataGo still has White ahead, at 53% and by half a point.
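The winrate figures quoted in this post can be lined up to see where the swings happen. The following is a minimal sketch that only does the arithmetic on the percentages stated above (55%, 57%, 54%, 53%, all for White); the step labels and the helper name `white_deltas` are illustrative, and the last step lumps B4 through W7 together because the post gives no per-move figures for them.

```python
# White's winrate in percent after each step, as quoted in the post.
steps = [("W1", 55.0), ("B2", 57.0), ("W3", 54.0), ("B4-W7", 53.0)]

def white_deltas(steps):
    """Change in White's winrate caused by each step after the first."""
    return [(label, after - before)
            for (_, before), (label, after) in zip(steps, steps[1:])]

for label, delta in white_deltas(steps):
    print(f"{label}: {delta:+.1f} points of winrate for White")
```

Read this way, the pincer gives White two percentage points, the jump gives back three, and the rest of the pattern, slides included, accounts for about one.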

My interpretation is that, given that bots like early corner invasions, a pincer makes the corner invasion even more attractive and so loses points compared with extending on the other side of the corner. The jump, in turn, makes it much less appealing to invade the corner, because White has already committed to the approach stone, so it loses points compared with the direct invasion. Once that commitment has been made, sliding to make a base is not bad per se.

I tested it in a 4 star point opening as well, where the results were similar.

Edit: The bold part says why I think bot analysis is great for learning: even if knowledge was passed on to me for the right reasons, I may have remembered it for the wrong reasons. I may have learned that playing on the 2nd line in the opening is bad per se. Autonomous analysis, with a strong player telling you where (but not why) it really goes wrong, can help you unlearn wrong insights.

https://senseis.xmp.net/?CatenaccioJoseki

Bill Spight · Post by **Bill Spight** » Wed Dec 04, 2019 8:34 am

Here is a remarkable game — remarkable in the number and size of its blunders — that pertains to the question of large points (oba) in the opening, all of which had been played by move 11. It is a game between two pro 2 dans trying out the New Fuseki in 1934. That explains the number of blunders, as they were not very strong and were in unfamiliar territory. By comparison with a play in a corner, all of the oba lost 10% in winrate or more, by Elf's reckoning in its commentary. I wonder if the oba plays would have been regarded as so questionable before the AI era. I have amended the Elf commentary file to include winrate losses by comparison with Elf's top choices in the opening. Tesuji means that the play was Elf's top choice.

To meet the size requirements I have deleted several secondary variations.

Knotwilg · Post by **Knotwilg** » Wed Dec 04, 2019 9:12 am

Bill Spight wrote:Here is a remarkable game — remarkable in the number and size of its blunders — that pertains to the question of large points (oba) in the opening, all of which had been played by move 10.

Is your "oba" = playing in the largest space available between two stones on the board?

Bill Spight · Post by **Bill Spight** » Wed Dec 04, 2019 9:34 am

Knotwilg wrote:
Bill Spight wrote:Here is a remarkable game — remarkable in the number and size of its blunders — that pertains to the question of large points (oba) in the opening, all of which had been played by move 10.
Is your "oba" = playing in the largest space available between two stones on the board?

This online go dictionary, http://www.godictionary.net/term/ooba.html , indicates that it is a wide extension of high value. According to that definition a splitting play is not an oba, but since it prevents a wide extension of high value, I have always regarded it as such.

columbo · Post by **columbo** » Wed Dec 04, 2019 10:37 am

This is a fascinating thread and a fascinating topic. Center stage is a very real debate about what to do with vast swaths of received "go wisdom," and both sides have very compelling arguments.

I won't pretend to have anything new to add as to the substance of those arguments, but I'd like to urge some caution, as the chosen course of action here has implications for the entire English-speaking go community. If you (or anyone) decide that conventional wisdom must all be tested for purity by the furnace of AI--and proceed to purify the online resources compiling those analyses and teachings--you had better be sure that the AI analyses are actually instructive. Of course some of them are (the "correct" 3-3 invasion sequences being nearly everyone's go-to examples, or the general caution against being pincer-happy), but the body of go wisdom (the tradition, we might say) has the advantage of being means tested for the sake of instruction. Sometimes our search for heuristic guides and human-language-based explanations of go strategy can oversimplify, but that's the nature of human thought at work. The advantage of the body of such thought that's been handed down is that in many cases (though of course not all) it's been filtered for its utility in guiding amateurs.

The hard work of converting AI insights into teachable lessons for amateurs is happening here on these forums; I believe this activity has to continue before we simply discard what's been handed down. As noted--baby, bathwater, etc.

Knotwilg · Post by **Knotwilg** » Wed Dec 04, 2019 11:55 am

columbo wrote:This is a fascinating thread and a fascinating topic. Center stage is a very real debate about what to do with vast swaths of received "go wisdom," and both sides have very compelling arguments.

I won't pretend to have anything new to add as to the substance of those arguments, but I'd like to urge some caution, as the chosen course of action here has implications for the entire English-speaking go community. If you (or anyone) decide that conventional wisdom must all be tested for purity by the furnace of AI--and proceed to purify the online resources compiling those analyses and teachings--you had better be sure that the AI analyses are actually instructive. Of course some of them are (the "correct" 3-3 invasion sequences being nearly everyone's go-to examples, or the general caution against being pincer-happy), but the body of go wisdom (the tradition, we might say) has the advantage of being means tested for the sake of instruction. Sometimes our search for heuristic guides and human-language-based explanations of go strategy can oversimplify, but that's the nature of human thought at work. The advantage of the body of such thought that's been handed down is that in many cases (though of course not all) it's been filtered for its utility in guiding amateurs.

The hard work of converting AI insights into teachable lessons for amateurs is happening here on these forums; I believe this activity has to continue before we simply discard what's been handed down. As noted--baby, bathwater, etc.

Thanks for the caution. Two points:
- I'm more inclined to clean up/correct SL if the diagrams are provided by amateurs like myself than when copied from original pro teachings
- if the material taught is significantly different from the course of action observed, the teaching material will be discarded anyway in due time; in the 90s, with the abundance of 4-4 joseki, the interest in Ishida's dictionary was declining too

It seems that many people fear that we will say the approach is wrong and the 3-3 invasion is correct, or that a piece of advice is wrong because the bots don't consider it. No, we may say that the 3-3 invasion is more popular with bots (and pros) and try to explain why that may be. And we may force the alternative upon the bot, see how it evaluates the move, and try to understand why it wasn't part of its policy. And if a diagram sends the message that one play is better than another while the bots say both are worse than something else by 10%, I believe that being skeptical about such a firm truth is just healthy critical thinking.

I asked for opinions and got advice to be cautious, so I'll take that to heart.

Kirby · Post by **Kirby** » Wed Dec 04, 2019 12:30 pm

Knotwilg wrote: I tested it in LZ/KataGo, early enough and in an opening where the play would make sense from both sides. First of all, we stay well below the 10% threshold, of course, even below 5%, or below 1.5 points as per KataGo. Secondly, it's not the moves which directly result in low stones that are considered "bad". I know this is not what pros/teachers said at the time but I could have thought that. The "bad moves" are the pincer and the jump:

$$W Catenaccio joseki
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . X . . . . |
$$ | . . . O . . . . . , . . . . . , X . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . 2 . 7 . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . 3 . 1 . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . 4 . X . 5 . |
$$ | . . . . . . . . . . . . . . . . 6 . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------

For simplicity, let's assume that, given [...], the remaining sequence given here is optimal for both sides (questionable due to the stated loss with the pincer of B2, but that loss is somewhat small). You've indicated that the slides don't result in significant loss.

Then isn't it possible that this sequence, which optimally results in slides on the second line, is inferior to white's 3-3 invasion, because the 3-3 invasion here is better than this sequence which results in white's two plays on the second line? I get that LZ prefers the 3-3 invasion. And I get that the percentage dip happened on the jump and not the slide. But if playing optimally from the jump results in two slides, those two slides could be considered as part of the expected result, and could be considered in the reasoning to say that this sequence is bad.

Imagine:
LZ (explaining): "No, don't play the jump here; that'll just end in two slides on the second line. Play the 3-3 instead, because that's better than playing on the second line here."
You play the jump anyway.
LZ (*sigh*): "Ok, that's a 10% loss, because this sequence ends in that second line play shape, which is worse than the 3-3"

Looking just at the percentage dip, you can get that the jump was a bad idea. But you don't have enough information to say *why* it's a bad idea. Is it just because the jump itself is bad in and of itself? Or is it because the expected sequence (that doesn't lose additional points) results in something inferior to the expected sequence resulting from 3-3? Or perhaps these are equivalent!

---

Think of playing out a ladder as another example: You could say that playing a move in a futile attempt to escape from a ladder is bad, because the ladder isn't working for you. Or you could explain it and say, "The only way that move will work out is if you can escape from the atari. And if you play all that out, the resulting sequence is that you get captured. So playing here is bad, because getting captured is bad." The move that dips the % is probably the first one that tries to escape from the ladder. But if you're looking for a *reason*, maybe it's the resulting position, sometimes.
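The same idea can be put as a toy minimax backup: a move is scored by the position that best play leads to, so the dip shows up at the first move of a line even when the "reason" sits at the end of it. This is only a sketch with a hypothetical tree; the move names, the leaf numbers, and the function `backup` are invented for illustration and are not engine output or how LZ actually searches.

```python
def backup(node, white_to_move):
    """Minimax backup: a position is worth what its best continuation is worth.

    Leaves are White's winrate; White picks the maximum, Black the minimum.
    """
    if isinstance(node, (int, float)):
        return node
    values = [backup(child, not white_to_move) for child in node.values()]
    return max(values) if white_to_move else min(values)

# Hypothetical toy tree, White to move at the root. Numbers are illustrative.
tree = {
    "3-3 invasion": {"black answers": 0.57},
    "jump": {"black answers": {"slide": 0.54, "tenuki": 0.50}},
}

for move, child in tree.items():
    # After White's move it is Black's turn, hence white_to_move=False.
    print(move, backup(child, white_to_move=False))
```

In this sketch the jump is already scored 0.54 the moment it is played, which is exactly the value of the slide position at the end of the line, so the percentage dip alone cannot tell you whether to blame the jump or the slides.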

---

My point here is that, if the second line slides "go with" the optimal sequence after the jump, then it's not necessarily unreasonable to say that the slides are bad, if that's what you expect to result from that sequence... ¯\_(ツ)_/¯

Gomoto · Post by **Gomoto** » Wed Dec 04, 2019 1:16 pm

“All that glisters is not gold; often have you heard that told.”

“AI that glisters is not gold; often have you heard that told.”

I don't toss away my go books. But I don't rewrite them either.

Gomoto · Post by **Gomoto** » Wed Dec 04, 2019 1:19 pm

3-3 is better than approach / approach is better than 3-3

All this urge to teach.

At least AI told us already, we are talking about a huge difference, if we do not follow the one or the other advice.

Knotwilg · Post by **Knotwilg** » Wed Dec 04, 2019 3:03 pm

Kirby wrote: You've indicated that the slides don't result in significant loss.

No no no! After posting, I realized my post could be misleading.

Kirby wrote:
Then isn't it possible that this sequence, which optimally results in slides on the second line, is inferior to white's 3-3 invasion, because the 3-3 invasion here is better than this sequence which results in white's two plays on the second line?

Yes yes yes! That's actually what I implied, the jump is worse than the 3-3 invasion, probably because it needs a base and that base can only be found on the second line.

BUT, I still don't rule out that the fault lies only partly with the need to play on the second line, and also partly with the loss of the corner and perhaps even, on a higher level, with early commitment.

AND, I could still envision a pattern that includes a play on the second line, but is still good. I remember an astonishing sequence of successive 2nd line plays, which I would never think of because I would be "crawling on the line of defeat" but LZ valued the goal of that crawling, connecting two weak groups.

Kirby · Post by **Kirby** » Thu Dec 05, 2019 7:58 am

Knotwilg wrote:That's actually what I implied, the jump is worse than the 3-3 invasion, probably because it needs a base and that base can only be found on the second line.

Ah, ok. Maybe I misunderstood what you meant. The question of "why" a move is good or bad is fascinating. In some ways, for both humans and computers, I suppose it comes down to, "because my model of how go works tells me so". Proverbs, like the one in the SL page that's referenced here, are examples of such models, though they may often be oversimplified - or maybe in some cases, simply wrong (play the last big point, hane at the head of two stones, don't do early 3-3 invasion).

I believe that there are often cases where proverbs like these have some degree of truth, but are oversimplifying things. Thinking in terms of statistics, I'd say that many of these models are prone to being "underfit" - they don't sufficiently describe reality.

LZ is more flexible, because you can take an arbitrary board position, and see what the AI thinks is good. You have a lot more detail here, and more nuances are taken into account.

From that perspective, just using LZ to make new proverbs may have value. That being said, I think we should be careful of overfitting (https://en.wikipedia.org/wiki/Overfitting) our internal models of how go works from looking at LZ analyses. If we look at, say, 10 examples to understand the nuances of a position, there's a good chance that some of the inferences we draw from LZ include bias toward exceptional scenarios that may not be true in the general sense. To reduce that effect, I guess the best thing to do is to analyze more and more games with LZ in order to reduce the impact of outliers.
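To put a rough number on the small-sample worry, here is a sketch using an exact binomial tail. The assumption, purely for illustration, is a heuristic that holds in 70% of positions; the code asks how likely it is that it holds in at most half of the positions you happen to look at. The 70% figure and the helper name `prob_at_most` are made up.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): at most k of n examples support the heuristic."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

p_holds = 0.7  # assumed share of positions where the heuristic is right
for n in (10, 100):
    print(n, prob_at_most(n // 2, n, p_holds))
```

With 10 examples there is a real chance that a mostly sound heuristic looks wrong about half the time; with 100 examples that chance becomes negligible, which is the case for analyzing more games before drawing general rules.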

It's from this idea that I'm wary of dropping some of the proverbs that are out there. Yes, LZ and other bots may show some deficiencies - the models given by proverbs may underfit reality; you can find several exceptions to the rule stated by the proverb. But the same danger is still present with LZ analysis, if your intent is to create a general rule or heuristic.

That being said, maybe the types of models and heuristics we're learning from bots are already more useful than some existing human proverbs. But it's hard for me to really measure that.
¯\_(ツ)_/¯

Knotwilg · Post by **Knotwilg** » Thu Dec 05, 2019 9:42 am

I have performed a master edit and a rename of the SL page leading to this discussion. Please check if you find this rendition satisfactory.

https://senseis.xmp.net/?TedomariExercise1

Life In 19x19
