Principles from Basic Endgame Trees (Daniel Hu)

Bill Spight · Post by **Bill Spight** » Fri Apr 23, 2021 6:25 am

John Fairbairn wrote:
deiri values are fine, and perhaps easier for most people to handle....As we know, playing the averages by choosing the theoretically largest play is typically correct, and not often wrong
If they do try to play an endgame properly, I suspect most people play this way most of the time - perhaps even pros, too. Calculating the numbers is indeed easy enough. What is not easy is to form a view on how much it matters (on average) when using these values turns out to be wrong.

Yes. We pretty much have to rely upon experience there. In a way it is a statistical question, but questions such as the universe of players arise. If your opponents do not punish your errors, making them isn't so bad.

John Fairbairn wrote:Such a view by an endgame expert is one of the things I have in mind when I argue in favour of words and not purely numbers in the exposition of boundary play.

Not having good numbers does not mean that words are better. Since experience matters, feelings and hunches may be best.

If, for example, it turns out that at low dan level the most you can lose on average in a typical endgame against a player of your level (who makes the same sort of mistakes) is just 5 points, it would be very useful to know that. MUCH MUCH more useful than being told how to calculate fractions of a move value,

Based on my experience, starting from the mid-endgame, where plays gain on average around 5 points, low level dans chuck around 15 points. To anticipate the discussion below, if they play solidly they can probably cut that loss in half. If their opponent does not play solidly, the question then becomes how well they can punish their opponent's mistakes.

John Fairbairn wrote:I can understand that endgame books are usually written by people of mathematical bent

It is painfully obvious that most of them are not.

John Fairbairn wrote:But I say they have to be less concerned with what other mathematicians might think of them and more concerned with how they can help the typical reader.

Although Mathematical Go sold pretty well, especially in Japan, I felt that it suffered from having a dual audience, one of mathematicians and one of go players. It was not easy to satisfy both.

Back in 2011 I decided to write an endgame book aimed at middle SDKs, and assembled a group of readers who would give me feedback. To my surprise, it wasn't the math that was the problem, it was the go.

John Fairbairn wrote:
Theory only provides heuristics.
I'm sure I'm part of a chorus here: what are they?

The obvious one is the strategy of playing the largest play, AKA hotstrat. The main exception comes when there is a significant drop in the global temperature. It will often be worth playing to get the last play before that drop. This is something that go players noticed long ago, leading to the advice to get the last big play of the opening, the last play of the large yose, and the last play of the game. For the last play of the game, there are both absolute plays and heuristics. My research with the Elf files suggests that getting the last big play of the opening is almost worthless advice. There is in general no significant temperature drop between the opening and the middle game. I have, however, come up with the following last play advice, which is to occupy the last open corner, as a rule. I know that occupying any open corner is said to be good, but in the pre-AI era pros often left the last open corner unoccupied in order to fight elsewhere. That was usually an error. Getting the last large yose does seem to be good advice. More on that later.
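A minimal sketch of the two ideas above, under strong simplifying assumptions: every position is a simple gote without follow-ups, each play is represented only by the number of points it gains, and a "temperature drop" is just a large gap between successive gains. The function names and the `min_drop` threshold are illustrative, not terminology from the thread.

```python
def hotstrat_playout(move_values):
    """Alternate turns, each side taking the largest remaining play.

    Returns (first_player_total, second_player_total).
    """
    totals = [0.0, 0.0]
    for turn, value in enumerate(sorted(move_values, reverse=True)):
        totals[turn % 2] += value
    return totals[0], totals[1]


def last_play_before_drop(move_values, min_drop):
    """Locate the first gap of at least min_drop between successive plays.

    Returns (index, player): index is the position (in descending order) of
    the last play before the drop, player is 0 for the side moving first and
    1 for the other side. Returns None if there is no such gap.
    """
    ordered = sorted(move_values, reverse=True)
    for i in range(len(ordered) - 1):
        if ordered[i] - ordered[i + 1] >= min_drop:
            return i, i % 2
    return None


values = [8, 7.5, 7, 3, 2.5, 2]
print(hotstrat_playout(values))          # (17.5, 12.5)
print(last_play_before_drop(values, 3))  # (2, 0)
```

With these toy numbers the side moving first also gets the last 7-point play before the gains fall to 3. Real positions have follow-ups, sente and ko, so this only shows the bookkeeping, not how to find the plays.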

Professor Berlekamp came up with a heuristic he called sentestrat, which is basically to answer your opponent's sente if it raises the global temperature. The idea is to limit your losses. Berlekamp was around 3 kyu, and I privately scoffed at the idea, dubbing it gotestrat. As a dan player I felt that I was duty bound to find a good intervening play, if I could. Sentestrat sounded like trying to make a virtue of followitis. Berlekamp was right about limiting your losses, OC.
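As a deliberately crude reading of that one-sentence description (not Berlekamp's formal definition), the decision can be sketched as a comparison of two numbers. Both temperatures are assumed to be already known, which in practice is the hard part, and the function name and return strings are illustrative.

```python
def sentestrat_choice(ambient_temperature, threat_temperature):
    """Decide whether to answer the opponent's last play.

    ambient_temperature: size of the largest play elsewhere on the board.
    threat_temperature: size of the follow-up the opponent just created.
    """
    if threat_temperature > ambient_temperature:
        # The opponent's play raised the temperature: limit losses by replying.
        return "answer locally"
    # Otherwise fall back on playing the largest play (hotstrat).
    return "play the hottest move elsewhere"


print(sentestrat_choice(5, 12))
print(sentestrat_choice(5, 3))
```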

But consider your question about how many points a low dan chucks in the endgame. Sentestrat is a way that they could limit their losses. It goes against the grain, but there we are. High dans and pros may interpose a play instead of meekly replying to an opponent's sente. OC, we cannot say that they are wrong to do so, but the result is often a fairly long term increase in the global temperature. When that occurs, it will frequently become important to get the last play before the temperature drops back down. This is why I say that getting the last big yose is often a good idea.

This kind of phenomenon cannot simply be observed by words. It is necessary to have the technical chops to do so.

Today's top bots do not often seem to play thickly, but play lightly and flexibly. That style of play may be difficult for humans to emulate, given our propensity for depth first search. However, especially at the large yose stage, it behooves us to be on the lookout for temperature drops and to consider tenuki when that happens.

John Fairbairn wrote:
However, by the 1970s players and writers realized that there was a theoretical problem with double sente.
That doesn't seem to square with Edo players being so good at the endgame.

The reputation of the Edo players is mostly intact. True, the bots may differ, but the large yose is one place where humans, given enough time, may play better than today's bots. As for the 20th century, I noted in the 1970s that top pros with 10 hours of time rarely made errors once the average gain dropped below 3 points. 3 points is in the middle yose range.

John Fairbairn wrote:But, leaving that aside I'll make a couple of other points. One is that you say in the thread you referenced that O Meien didn't talk about double sente. He did, twice, and on one occasion said he felt smug about playing it.

I acknowledged that at the time, didn't I? But O Meien did not include double sente as a classification for calculating the size of plays, which was, and is, my concern. Kano in the 70s and the Nihon Kiin, as late as 2000, tripped up badly in that regard. As I said above, it is not difficult to find examples of double sente in practice. The problem is that it is too easy to find false examples if you believe in double sente.

John Fairbairn wrote:Switching back to words over numbers, one of the most noticeable features of pro commentaries towards the end of a game is that they VERY rarely mention numbers (for move sizes) but they VERY often describe a move or a position as thick, and therefore advantageous.

As you know, I gradually developed a thick style of play. The bots would probably regard it as heavy, what can I say? But one advantage of playing thickly in the large yose, or maybe earlier, is that it frees you to play enterprisingly to take advantage of your opponent's misplays. Also, IMX, when the average gain drops to around 7 points is often when plays in the corner, often involving ko, become significant again. Thick play prepares the way for ko fights by reducing the number of possible ko threats by the opponent.

John Fairbairn wrote:Now it seems to me that if it is worth harping on about gaining a point or a fraction in the micro-endgame (not that I think it is, but there's an 'if' in there), then it must be worth a humungous amount more to study the earlier and bigger stage when thick moves or positions become an issue. It would be immensely valuable, therefore, to have an authoritative endgame writer write a book or a chapter on how and when to play such thick moves, how to recognise and appreciate them, and so on. There's a vast number of examples in the literature. This exposition would have to be done mostly in words, of course, but that underlies my point: words are more useful than numbers for many people. Instead of improving by a point or two you can look forward to improving by a grade or two.

In vol. 6 of Sakata no Go, which is about endgame calculation, Sakata shows a good example of thick play in the endgame. That play allowed Sakata to continue enterprisingly. To show the advantage of the thick play, you cannot just rely upon words, you have to have the technical chops.

This is speculative, but it may well be that thick play tends to produce positions where humans, even strong amateurs, can play better than today's bots. I doubt if I will be able to research that question, but I think it is worth doing.

RobertJasiek · Post by **RobertJasiek** » Fri Apr 23, 2021 7:08 am

Bill Spight wrote:Back in 2011 I decided to write an endgame book aimed at middle SDKs, and assembled a group of readers who would give me feedback. To my surprise, it wasn't the math that was the problem, it was the go.

What was problematic about the go (theory)? That you had thought to have it ready but found that you did not have it ready yet?

The reputation of the Edo players is mostly intact.

As is KataGo's? However, it may simply be that nobody has studied their endgame play carefully enough to verify it against modern theory. I may do it some time if I find the necessary time.

Bill Spight · Post by **Bill Spight** » Fri Apr 23, 2021 8:17 am

RobertJasiek wrote:
Bill Spight wrote:Back in 2011 I decided to write an endgame book aimed at middle SDKs, and assembled a group of readers who would give me feedback. To my surprise, it wasn't the math that was the problem, it was the go.
What was problematic about the go (theory)? That you had thought to have it ready but found that you did not have it ready yet?

No, my readers understood the math just fine. But they did not find the plays, which I wouldn't even call tesuji. {shrug}

John Fairbairn · Post by **John Fairbairn** » Fri Apr 23, 2021 8:37 am

you cannot just rely upon words, you have to have the technical chops.

Bill: But I think I learned more from your one post (thank you) than I've learned from all other endgame writers excluding O Meien. Why? Because of the words.

Technical nous matters of course, but I think it matters far more for the person framing the advice (as it should) than for the end user. To repeat my previous analogy, to drive a car I need some basic technical knowledge (gear shifts, how to fill up with petrol, etc), but I don't have to know how the differential works - or even where it is - or even if it's there or not.

Cassandra · Post by **Cassandra** » Fri Apr 23, 2021 9:02 am

RobertJasiek wrote:As is KataGo's? However, it may simply be that nobody has studied their endgame play carefully enough to verify it against modern theory. I may do it some time if I find the necessary time.

KataGo does N O T care about points!

In this sense, it might be a useless tool for you (during "play" / "analyse"), if you ever wanted to use its assistance for the verification of some parts of your "modern theory".
If KataGo has never encountered an endgame position in question (nor anything comparable), it is quite likely that it will be unable to find the "correct" (your understanding) endgame sequence, no matter how powerful the machine it runs on is.

If you ever wanted KataGo to "solve" such kind of endgame positions, you would have to go back to the selfplay training of the net, and feed the training with appropriate sample material.
The underlying problem is absolutely the same as with solving Igo Hatsuyôron 120.

If you let KataGo play games, I am very sure that it will never reach any of the endgame positions, where your "modern theory" possibly matters (just as it would never reach the starting position of Igo Hatsuyôron 120).
And this might be also true for games of the old masters, which are said to have had a similar style as modern AI.

Bill Spight · Post by **Bill Spight** » Fri Apr 23, 2021 10:04 am

Cassandra wrote:
RobertJasiek wrote:As is KataGo's? However, it may simply be that nobody has studied their endgame play carefully enough to verify it against modern theory. I may do it some time if I find the necessary time.
KataGo does N O T care about points!

In this sense, it might be a useless tool for you (during "play" / "analyse"), if you ever wanted to use its assistance for the verification of some parts of your "modern theory".
If KataGo has never encountered an endgame position in question (nor anything comparable), it is quite likely that it will be unable to find the "correct" (your understanding) endgame sequence, no matter how powerful the machine it runs on is.

One thing that I have noticed with the Elf annotated files is that Elf will frequently judge the human response more favorably than its top choice, especially when the human response was not on Elf's radar. Usually this may be considered noise, but it yields the possibility of coming up with alternative lines of play that may be better than the original ones produced by Elf. I don't know if that happens with KataGo, but I would try setting the komi for KataGo so that the winrate estimate is approximately 50% and seeing what happens. I am not talking about rare or unusual positions, but fairly common and everyday ones.

Cassandra · Post by **Cassandra** » Fri Apr 23, 2021 10:50 am

Bill Spight wrote:One thing that I have noticed with the Elf annotated files is that Elf will frequently judge the human response more favorably than its top choice, especially when the human response was not on Elf's radar. Usually this may be considered noise, but it yields the possibility of coming up with alternative lines of play that may be better than the original ones produced by Elf. I don't know if that happens with KataGo, but I would try setting the komi for KataGo so that the winrate estimate is approximately 50% and seeing what happens. I am not talking about rare or unusual positions, but fairly common and everyday ones.

Dear Bill,

I am sorry, but I can only report on my experiences with the combination of KataGo and Igo Hatsuyôron 120, and here below especially for positions, for which we assume that KataGo has not sufficiently encountered these during the selfplay training. Simply because these positions are quite interesting for further comparative analysis, but "hidden" behind several mistakes of both sides (so there is no reason for KataGo to visit especially these positions during training). And yes, Igo Hatsuyôron 120 is kind of special, just because some decisive effects in the longish sequences seem to be far behind KataGo's event horizon, and so unavailable during "play" / "analysis".

You are right with emphasising the need to vary the komi during analysis. There will be cases where you will have to do so per move.
In my experience, it is especially important to achieve a win rate level (around 50 % likely) that stops KataGo from considering / favouring "desperate measures" for the side that seems to be too far behind to win the game. These "desperate moves" are likely to result in an even worse result at the end of the game (human understanding, in points). On the other hand, you cannot trust that the side that is largely ahead (in win rate) favours the "best" move (human understanding, in points).
During this process, you must forget what you know about the position / result. It seems to be best to let KataGo believe -- based on its OWN analysis, not the absolute truth -- that the current position is nearly balanced.
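The komi adjustment described here can be sketched as a bisection search. This is an illustration under stated assumptions, not a KataGo interface: `winrate_for_black` is a hypothetical callable that would wrap whatever engine query you use, it is assumed to decrease as komi rises, and `toy_winrate` is a made-up stand-in so that the sketch runs on its own.

```python
import math


def balance_komi(winrate_for_black, lo=-50.0, hi=50.0, tol=0.25, max_iter=40):
    """Bisect on komi until Black's estimated winrate is close to 50%.

    winrate_for_black: callable komi -> winrate in [0, 1], assumed to fall
    as komi (compensation given to White) increases.
    """
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if winrate_for_black(mid) > 0.5:
            lo = mid  # Black is still favoured: give White more komi
        else:
            hi = mid  # White is favoured (or even): reduce komi
    return (lo + hi) / 2


def toy_winrate(komi, true_margin=6.5, scale=4.0):
    """Stand-in evaluator: a logistic curve centred on a 6.5-point lead."""
    return 1.0 / (1.0 + math.exp((komi - true_margin) / scale))


print(round(balance_komi(toy_winrate) * 2) / 2)  # rounded to the nearest half point
```

Bisection only makes sense while the winrate really is monotone in komi. In positions that are very sensitive to komi, or that have several lines ending in different scores, a manual back-and-forth is likely to be needed instead.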

However, you will encounter positions / sequences that are extremely sensitive to varying the komi. In these cases, you will have to go back and forth some times, until you have a better understanding of what is going on.

There will be also positions, which ensuing sequences have several "local" maxima (human understanding, in points => several lines for winning the game, but with different scores at the end). If you vary the komi in these cases, it is likely that KataGo's favourite moves / sequences differ, dependent on which "local" maximum (being enough for winning the game with the current komi) is the nearest.

With Igo Hatsuyôron 120, we have the large advantage that we (believe to) have a quite good understanding what is going on, and indeed can find moves (sometimes) that are better (human understanding, in points) than what KataGo recommended, based on the current knowledge of the special net.
Based on this understanding, we can provide the selfplay training with suitable material for the database of starting positions to be examined (besides much more that KataGo chooses automatically). It needs a long time until KataGo has had so many selfplay games started from these positions that it will sustainably adjust its network, and also come up with the then "correct" line of play during "play" / "analysis".

To come back to the beginning:
The effect you described with Elf might have a similar reason. In a position that is "unknown" to the network used I assume it is definitely possible that a human choice will be assessed better than the AI's favourite, once the stone has been put on the board. If the AI cannot rely on the knowledge in its network, but is dependent on its processing power only, you cannot be sure that the AI's favourite is really the "best" move.

lightvector · Post by **lightvector** » Fri Apr 23, 2021 1:55 pm

Cassandra, I think you might be overfitting too much to your experience on igoh120. It is rare for normal games to have such massively globally entangled tactics at the point where the endgame values of moves become small. And it's rare in normal games for the effect of every endgame exchange to be masked behind a long sequence of liberty filling of a huge capturing race before the position visually collapses down to the final result.

KataGo does care about points, about as much as human pros care about points in tournament games. Which is to say - just like human pros, KataGo will give up small amounts to play thickly when ahead, and escalate to risk large losses sometimes when looking for a place to resign. But it won't obviously play locally suboptimal moves, defend where completely pointless, fill in its own territory, etc, like some prior bots. And when ahead, it will still take free points when offered, kill groups if it is not risky, play decent basic endgame, etc - playing accurately enough so that even if not perfectly optimal, against amateur players, it tends to continue to gain many points in the endgame even under the constraint of not doing anything tremendously risky.

lightvector · Post by **lightvector** » Fri Apr 23, 2021 2:47 pm

@Bill or anyone else doing analysis - if you're doing any form of analysis, I recommend setting "analysisWideRootNoise" to a value like 0.05, or 0.10. This setting makes KataGo explore much more widely at the root, judging a lot more moves, and makes it a little less likely that a move is missed in the way you described.

You can also easily configure KataGo to search wider in general, not just at the root. For example, "nnPolicyTemperature=1.1" and "cpuctExploration=1.5" and "cpuctExplorationLog=0.6". This will also slightly reduce the risk of missing improved moves within the tree. Or try yet-more extreme numbers, play with it and see.
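For convenience, the values quoted in these two paragraphs can be collected and written out as `key = value` lines. The setting names and numbers are the ones given above; which config file they belong in depends on how you run KataGo and is not stated here, so treat the destination as an assumption to check against your own setup. The helper name is illustrative.

```python
# Values as quoted in the post; 0.10 was also suggested for analysisWideRootNoise.
WIDE_ANALYSIS_SETTINGS = {
    "analysisWideRootNoise": 0.05,
    "nnPolicyTemperature": 1.1,
    "cpuctExploration": 1.5,
    "cpuctExplorationLog": 0.6,
}


def render_cfg_lines(settings):
    """Format settings as 'key = value' lines, one per setting."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


print(render_cfg_lines(WIDE_ANALYSIS_SETTINGS))
```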

The reason these settings aren't default, of course, is that they make play worse on average, holding compute fixed. Because for play, you care a little bit more about finding a move that's good enough and being a bit more sure of the move you're planning to play that there is nothing wrong with it when you go deep. Whereas for analysis, depending on the goal of your analysis, maybe you care more about comparing alternatives, or finding the most likely optimal move, rather than a good-enough move, at the cost of searching less deeply and being less sure about any individual move that you might pick.

Cassandra · Post by **Cassandra** » Fri Apr 23, 2021 7:26 pm

lightvector wrote:KataGo does care about points, about as much as human pros care about points in tournament games. Which is to say - just like human pros, KataGo will give up small amounts to play thickly when ahead, and escalate to risk large losses sometimes when looking for a place to resign. But it won't obviously play locally suboptimal moves, defend where completely pointless, fill in its own territory, etc, like some prior bots. And when ahead, it will still take free points when offered, kill groups if it is not risky, play decent basic endgame, etc - playing accurately enough so that even if not perfectly optimal, against amateur players, it tends to continue to gain many points in the endgame even under the constraint of not doing anything tremendously risky.

Dear lightvector,

I am afraid that my brief statement was too bold.

In my understanding, the training of AI programs (i.e. the network) follows the main goal of "winning the game" (as this is the final measurand at the end of a selfplay game). In doing so later during "play" / "analyse", the AI will optimise its prospects for winning the game, as you described in detail, and similar to what human professionals are used to do in important games.
(Probably too) Strongly abbreviated, this behaviour would correspond to "Care about win rate". E.g. avoid moves whose potential benefits (in points) are out of proportion to their likely risk (in percent). Do not struggle to maximise the final score, no matter the costs.

However, "winning the game" (seen globally) cannot be the principal object for "modern endgame theory", which is much more likely to stress the maximisation of the final score of the game.
Board positions for deriving and / or application of the "scientific" principles might be as artificial as the starting position of Igo Hatsuyôron 120 is.
Board positions, which were created for demonstrating that programs are not perfect in the endgame, will be artificial as well. "Solving" that endgame "problem" would be required in principle, and so would run into the same basic difficulties as with solving a very special classic middle game problem.

RobertJasiek · Post by **RobertJasiek** » Sat Apr 24, 2021 3:20 am

Back to the paper, chapter 1:

Chilling applies to all trees iteratively by imposing the tax on the currently moving player.

Chapter 2:

Instead of the exercise, I would like to see two theorems and their proofs. Later, I will publish a proof related to the second part of the exercise for simple gotes without follow-ups (or real numbers, if you prefer) and decreasing-or-constant move values. For that, induction is not needed. Took me 20 minutes to prove. (Several others failed to prove this a couple of years ago at the DGoB forum. Taking the larger cake or calling it obvious do not count as a proof.) I suppose, proving for arbitrary gote trees must consume more time.
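An exhaustive check is no substitute for the proof asked for here, but it is a cheap sanity test of the claim for simple gotes without follow-ups. In this sketch each local endgame is modelled only as a number of points gained by whoever plays there, and the names `alternating_sum`, `minimax_net_gain` and `first_counterexample` are illustrative, not taken from the paper.

```python
from functools import lru_cache
from itertools import combinations_with_replacement


def alternating_sum(values):
    """Net gain of the first player if both sides always take the largest."""
    ordered = sorted(values, reverse=True)
    return sum(v if i % 2 == 0 else -v for i, v in enumerate(ordered))


def minimax_net_gain(values):
    """Net gain of the first player under optimal play, by brute force."""

    @lru_cache(maxsize=None)
    def best(remaining):
        if not remaining:
            return 0
        # Take play i, then the opponent plays optimally on what is left.
        return max(
            remaining[i] - best(remaining[:i] + remaining[i + 1:])
            for i in range(len(remaining))
        )

    return best(tuple(sorted(values, reverse=True)))


def first_counterexample(max_value=6, max_count=5):
    """Return a multiset where greedy play is not optimal, or None."""
    for count in range(max_count + 1):
        for values in combinations_with_replacement(range(1, max_value + 1), count):
            if minimax_net_gain(values) != alternating_sum(values):
                return values
    return None


print(alternating_sum([5, 3, 2]), minimax_net_gain([5, 3, 2]))  # 4 4
print(first_counterexample())  # None
```

The search covers only integer values up to 6 and at most five plays, so it shows that no small counter-example exists in this model, nothing more.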

RobertJasiek · Post by **RobertJasiek** » Sat Apr 24, 2021 3:44 am

Section 3.1:

n seems to be the number of local endgames but what is 'a'? When considering the three games abd, bac, bca, how can one start with a but the others start with b when there is only one (n=1) local endgame?

What boundary and why at a = c?

I cannot verify the optimum scores and their conditions as long as it is unclear how many and what local endgames we have.

Principles 2+3 are informal and I suspect counter-examples exist but it depends on clarification as before.

Principle 4 and its argument are plainly wrong.

Consider as counter-example the local gote endgame (in move value | follow-up move value annotation) 100|1 and another local gote endgame 4|2.

RobertJasiek · Post by **RobertJasiek** » Sat Apr 24, 2021 3:59 am

Section 3.2:

"Decreasing difference" I and some number theorists studying such things call it "alternating sum" because the sign of each summand alternates.

Proposition 3:

Basic and important. I have proved such but would expect your proof if you state it as proposition. Right, telescoping terms are useful here but this phrase is new to me. Have you invented it or is it used regularly?

RobertJasiek · Post by **RobertJasiek** » Sat Apr 24, 2021 4:12 am

Proposition 4:

Basic and useful. The proof is straightforward. I have been using such, too. I suppose |A+| means number of numbers in A+.

I prefer shorter annotation:

∆A := ∆(A)

A|b := A ∪ { b }

RobertJasiek · Post by **RobertJasiek** » Sat Apr 24, 2021 7:13 am

Lemma 5:

How is the proof trivial? Either you or I make a sign mistake. Here is my attempt of a proof:

(Each u is in superscript. Sorry, without mouse, marking individual letters is too difficult on iPadOS.)

∆A - ∆A|b
= ∆A+ + (-1)^u ∆A- - ∆A+ - (-1)^u (b - ∆A-)
= (-1)^u ∆A- - (-1)^u b + (-1)^u ∆A-
= (-1)^u 2∆A- - (-1)^u b

Life In 19x19
