All times are UTC - 8 hours [ DST ]




 Post subject: Re: Find the best move (AI will be no help)
Post #21 Posted: Fri Nov 01, 2019 2:39 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
Uberdude wrote:
Bill Spight wrote:
Normally we expect that between evenly matched opponents the first player's advantage will increase as their level of play increases. The opposite is the case here.

Bill, you've said this before, but I still don't see why you say that. If the 7.5 Komi is slightly too big, as seems to be the case, then as a bot gets stronger I would expect its winrate for White on the empty board to increase as the stronger player is better able to carry that advantage through to the endgame.


Well, this is one reason why knowing the margin of error of winrate estimates is important. Back in the 1970s someone wrote an article in the AGA Journal claiming that the proper statistical komi was 7. (Actually, the author did not qualify as I did. He said that it was "as plain as a pikestaff" that komi was 7.) Ing's statistics indicated to him that for area scoring it was 8. In the early 1980s some people thought that for area scoring it was as high as 9.

As for today's bots, they seem to believe that 7.5 komi is a little too high, giving an advantage to White. However, as far as I can tell, the estimated advantage is within the margin of error. Winrate estimates always come with errors; we just don't know much about those errors.

We know that komi is roughly half of the temperature of the empty board, i.e. half of how much the initial gote or reverse sente gains. That's why we guess that the first play gains around 14 pts. However, suppose that Black is not good enough to gain that much, but only gains 12 pts. Maybe Black is a kyu player, and White is at the same level. Then we can estimate the statistical komi for them as only 6 pts. instead of 7. For random players the statistical komi might be 3 pts. If both players are so weak that their statistical komi is 6 pts. and Black gives 7.5 pts. komi, then White will have an advantage. Now suppose that both players improve, so that their statistical komi is 7. Then White's advantage will be less.
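The arithmetic above can be sketched as a tiny function; a minimal illustration assuming, as in the post, that statistical komi is simply half of what the first play actually gains for players at a given level (the function name and numbers are the post's own examples, nothing standard):

```python
# Sketch of the "statistical komi" idea: komi is taken as roughly half
# of what the first play actually gains for the players in question.
# Values are the post's examples, not measured data.

def statistical_komi(first_move_gain: float) -> float:
    """Komi as roughly half the gain of the initial play."""
    return first_move_gain / 2

print(statistical_komi(14))  # strong play: first move gains ~14 -> komi ~7
print(statistical_komi(12))  # kyu players gaining only 12 -> komi ~6
```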

We know that as bots get better, their statistical komi will approach correct komi, the board result with perfect play. If correct komi is 7, then as they get better Black will not gain an advantage from that. But if it is 8 or 9 then Black will. On an odd parity board correct area komi is unlikely to be 8, but it is possible. I don't think that it is 9, but quien sabe?

Now, none of today's top bots has given an initial advantage to Black, but 3 or 4% is likely to be within their margin of error. If LZ now estimates White's winrate as 57% with 7½ komi, that estimate is likely to be greater than the margin of error. If so, that's news. :)

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.


This post by Bill Spight was liked by: Gomoto
 Post subject: Re: Find the best move (AI will be no help)
Post #22 Posted: Fri Nov 01, 2019 6:24 am 
Dies in gote

Posts: 53
Liked others: 2
Was liked: 12
Would you care to elaborate what you mean by margin of error? (although you may have done so before)

Wouldn't the term 'margin of error' imply that there is a 'true' winrate? With respect to perfect play, that makes no sense.

So then it would have to be the winrate of what? Some match with fixed parameters of a certain bot? The parameters should greatly affect the outcome.

In any case it's not too difficult to test

 Post subject: Re: Find the best move (AI will be no help)
Post #23 Posted: Fri Nov 01, 2019 7:35 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
Yakago wrote:
Would you care to elaborate what you mean by margin of error? (although you may have done so before)


It is true that I have used margin of error to refer to different things. Here I am taking a player's winrate to be an estimate of the probability of a win by that player, given perfect play from the given position, and given our state of knowledge at this time. That is not exactly how the winrate is defined. It is an estimate of the probability of a win by that player, given self play by the bot in question from the given position, given our current state of knowledge. OC, we do not assume that the bot plays perfectly, but it is taken as playing as close to perfect play as we can come at this time. (For weaker bots I would not use margin of error in this way.)

Quote:
Wouldn't the term 'margin of error' imply that there is a 'true' winrate? With respect to perfect play, that makes no sense.


Actually, it does in Bayesian terms, since the probability is conditioned on our state of knowledge. You may not want to call that a true winrate.

Quote:
So then it would have to be the winrate of what? Some match with fixed parameters of a certain bot? The parameters should greatly affect the outcome.

In any case it's not too difficult to test


I agree. It just takes time and careful research. By which time there may be a new bot on the block. Nobody seems interested in doing that research. A lot of people take winrates as gospel. I talk about margin of error to emphasize that they are not.


 Post subject: Re: Find the best move (AI will be no help)
Post #24 Posted: Fri Nov 01, 2019 10:25 am 
Judan

Posts: 6727
Location: Cambridge, UK
Liked others: 436
Was liked: 3720
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Bill Spight wrote:
Yakago wrote:
Would you care to elaborate what you mean by margin of error? (although you may have done so before)


It is true that I have used margin of error to refer to different things. Here I am taking a player's winrate to be an estimate of the probability of a win by that player, given perfect play from the given position, and given our state of knowledge at this time. That is not exactly how the winrate is defined. It is an estimate of the probability of a win by that player, given self play by the bot in question from the given position, given our current state of knowledge. OC, we do not assume that the bot plays perfectly, but it is taken as playing as close to perfect play as we can come at this time


Isn't that definition of margin of error thus only 100%-x or x% because with perfect play it's either a win or a loss?

When speaking of margin of error I think of these more tractable ideas that one could actually measure:
1) sampling error, as in the variance of the measurement I am taking (e.g. what winrate does LZ network 234 give for this move after 20k playouts). I can repeat that measurement several times, maybe in a different environment such as a different graphics card or a different number of threads, so there's a little variation from the randomness of multi-threading, but in my experience this tends to be quite small, not more than a percentage point or two.
2) error in using fewer playouts as an estimate of what the bot would think with more playouts, e.g. how good is giving LZ 20k playouts at estimating what LZ would think after much more analysis. I'm not even sure the AG-0 algorithm asymptotes to perfect play with infinite time as classic MCTS is supposed to, but let's say LZ can play at its best at 1 billion playouts. Then the margin of error asks: how good is LZ's winrate after 20k playouts as an estimate of what LZ would think after 1 billion? This error can be larger than the above, because it can take LZ many, many playouts to overcome blind spots, find refutations, read ladders etc., and these can make drastic changes in its evaluation of a position.
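The first kind of error above is straightforward to estimate empirically. A minimal sketch, with purely hypothetical readings standing in for repeated LZ measurements of the same position:

```python
import statistics

# Hypothetical repeated winrate readings (in percent) for one move from
# the same network and playout budget, re-run under slightly different
# conditions (different GPU, different thread count, etc.).
readings = [56.8, 57.3, 56.5, 57.1, 56.9, 57.4, 56.6, 57.0]

mean = statistics.mean(readings)
spread = statistics.stdev(readings)       # sample standard deviation
stderr = spread / len(readings) ** 0.5    # standard error of the mean

print(f"mean {mean:.2f}%, spread +/-{spread:.2f}, std. error +/-{stderr:.2f}")
```

With variation on the order of a few tenths of a percentage point, this matches Uberdude's observation that the sampling error is small compared to the playout-budget error.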

 Post subject: Re: Find the best move (AI will be no help)
Post #25 Posted: Fri Nov 01, 2019 11:18 am 
Gosei

Posts: 1596
Liked others: 891
Was liked: 533
Rank: AGA 2k Fox 3d
GD Posts: 61
KGS: dfan
lightvector had a couple of good detailed comments a few months ago (1, 2) about all the difficulties (or at least a lot of them) involved in trying to define and/or calculate "margin of error" when it comes to win rates, which are good background for anyone interested in the topic.

 Post subject: Re: Find the best move (AI will be no help)
Post #26 Posted: Fri Nov 01, 2019 11:20 am 
Lives with ko

Posts: 218
Liked others: 23
Was liked: 14
Rank: IGS 3k
KGS: Bki
IGS: mlbki
When I think about the margin of error, it's obviously relative to the real probability of the bot winning from this position (EDIT: though from what lightvector said, apparently I might be wrong about this). That probability is something that clearly exists (even if we don't know it exactly), and everything else really has no meaning, given that it is the only thing the bot estimates.

Uberdude wrote:
Bill Spight wrote:
Yakago wrote:
Would you care to elaborate what you mean by margin of error? (although you may have done so before)


It is true that I have used margin of error to refer to different things. Here I am taking a player's winrate to be an estimate of the probability of a win by that player, given perfect play from the given position, and given our state of knowledge at this time. That is not exactly how the winrate is defined. It is an estimate of the probability of a win by that player, given self play by the bot in question from the given position, given our current state of knowledge. OC, we do not assume that the bot plays perfectly, but it is taken as playing as close to perfect play as we can come at this time


Isn't that definition of margin of error thus only 100%-x or x% because with perfect play it's either a win or a loss?

When speaking of margin of error I think of these more practical ideas:
1) sampling error, as in the variance of the measurement I am taking (e.g. what winrate does LZ network 234 give for this move after 20k playouts). I can repeat that measurement several times, maybe in a different environment such as a different graphics card or a different number of threads, so there's a little variation from the randomness of multi-threading, but in my experience this tends to be quite small, not more than a percentage point or two.
2) error in using fewer playouts as an estimate of what the bot would think with more playouts, e.g. how good is giving LZ 20k playouts at estimating what LZ would think after much more analysis. I'm not even sure the AG-0 algorithm asymptotes to perfect play with infinite time as classic MCTS is supposed to, but let's say LZ can play at its best at 1 billion playouts. Then the margin of error asks: how good is LZ's winrate after 20k playouts as an estimate of what LZ would think after 1 billion? This error can be larger than the above, because it can take LZ many, many playouts to overcome blind spots, find refutations, read ladders etc., and these can make drastic changes in its evaluation of a position.


On 1), you can actually get a theoretical upper bound on the variance (0.25). Then it's pretty easy to determine an upper bound on confidence intervals. Of course, you can do more work to be more precise, but statistical inference isn't my domain, and really, it's already good enough to know, for example, that the radius of a 95% confidence interval is ~3% with 1000 playouts and ~1% with 10000.
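For what it's worth, the ~3% and ~1% figures check out against the worst-case bound. Treating each playout as an independent win/loss sample with variance at most p(1-p) <= 0.25 (which real MCTS playouts are not, so this is only a rough upper-bound heuristic):

```python
import math

# Worst-case 95% confidence radius for the mean of n win/loss samples,
# using the Bernoulli variance bound p * (1 - p) <= 0.25. Assumes
# independent samples, which MCTS playouts are not, so treat this as
# an upper-bound heuristic only.

def ci_radius(n: int, z: float = 1.96) -> float:
    return z * math.sqrt(0.25 / n)

print(f"{ci_radius(1_000):.3f}")   # ~0.031, i.e. about 3%
print(f"{ci_radius(10_000):.3f}")  # ~0.010, i.e. about 1%
```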

 Post subject: Re: Find the best move (AI will be no help)
Post #27 Posted: Fri Nov 01, 2019 11:48 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
Uberdude wrote:
Bill Spight wrote:
Yakago wrote:
Would you care to elaborate what you mean by margin of error? (although you may have done so before)


It is true that I have used margin of error to refer to different things. Here I am taking a player's winrate to be an estimate of the probability of a win by that player, given perfect play from the given position, and given our state of knowledge at this time. That is not exactly how the winrate is defined. It is an estimate of the probability of a win by that player, given self play by the bot in question from the given position, given our current state of knowledge. OC, we do not assume that the bot plays perfectly, but it is taken as playing as close to perfect play as we can come at this time


Isn't that definition of margin of error thus only 100%-x or x% because with perfect play it's either a win or a loss?


That's the maximum error, OC. But with research we can come up with better margins of error.

Quote:
When speaking of margin of error I think of these more tractable ideas that one could actually measure:
1) sampling error, as in the variance of the measurement I am taking (e.g. what winrate does LZ network 234 give for this move after 20k playouts). I can repeat that measurement several times, maybe in a different environment such as a different graphics card or a different number of threads, so there's a little variation from the randomness of multi-threading, but in my experience this tends to be quite small, not more than a percentage point or two.
2) error in using fewer playouts as an estimate of what the bot would think with more playouts, e.g. how good is giving LZ 20k playouts at estimating what LZ would think after much more analysis. I'm not even sure the AG-0 algorithm asymptotes to perfect play with infinite time as classic MCTS is supposed to, but let's say LZ can play at its best at 1 billion playouts. Then the margin of error asks: how good is LZ's winrate after 20k playouts as an estimate of what LZ would think after 1 billion? This error can be larger than the above, because it can take LZ many, many playouts to overcome blind spots, find refutations, read ladders etc., and these can make drastic changes in its evaluation of a position.


As for 2), that's the approach I took with Leela 11 a few years ago in regard to the cheating question. I did not just assume that more playouts gave a correct answer. What I did first was show that the differences between plays and winrates for several positions were probably not random. That was evidence that the winrate differences represented real differences in evaluation, the assumption being that with more playouts the evaluation was better. That then indicated that the winrate errors of the presumed mistakes were at least 3%. As a rule of thumb, then, if the winrate estimates of two plays by Leela 11 differ by less than 3% at a certain setting (100k), I do not think that we can assume that the play with the worse winrate is actually a worse play. Humans make use of winrate estimates, for better or worse. It would help if we had a good idea of how good they were.
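As a hypothetical illustration of the "probably not random" step, one simple check is a sign test: if the winrate gap points the same direction in almost every sampled position, chance becomes an implausible explanation. A sketch (the 9-of-10 figure below is invented, not from the actual Leela 11 study):

```python
from math import comb

# Sign-test sketch: if the deeper search's winrate gap points the same
# way in k of n independent positions, how likely is that pattern under
# a fair coin? (One-sided binomial tail.)

def binomial_tail(k: int, n: int) -> float:
    """P(at least k heads in n fair coin flips)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p = binomial_tail(9, 10)
print(f"{p:.4f}")  # 0.0107 -- hard to attribute to chance
```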


 Post subject: Re: Find the best move (AI will be no help)
Post #28 Posted: Fri Nov 01, 2019 4:22 pm 
Lives in sente

Posts: 759
Liked others: 114
Was liked: 916
Rank: maybe 2d
Bill Spight wrote:
Uberdude wrote:
Bill Spight wrote:
It is true that I have used margin of error to refer to different things. Here I am taking a player's winrate to be an estimate of the probability of a win by that player, given perfect play from the given position, and given our state of knowledge at this time. That is not exactly how the winrate is defined. It is an estimate of the probability of a win by that player, given self play by the bot in question from the given position, given our current state of knowledge. OC, we do not assume that the bot plays perfectly, but it is taken as playing as close to perfect play as we can come at this time


Isn't that definition of margin of error thus only 100%-x or x% because with perfect play it's either a win or a loss?


That's the maximum error, OC. But with research we can come up with better margins of error.


Ummm, maybe I'm misunderstanding something. Since Go has no inherent randomness, in a given position perfect play will always win or will always lose. So say a bot says 60%. Then, as Uberdude said, if comparing to the probability of winning under perfect play, either 40% or 60% is precisely the error. It is not merely the "maximum" error; one of those two *is* the error. Aside from somehow divining optimal play to know which of those two it is, there is nothing further to research and no better estimate is possible.

Did you mean something different by "perfect play" than what people usually mean by it? Or was "with research we can come up with better margins of error" already an acknowledgement the definition you gave wasn't actually the definition you wanted, and presumably the "research" would also involve finding what definition would be better? Or am I just misreading something?

Edit: As quoted, you did mention that winrates in practice are based on self-play, not perfect play, which is true. But then I don't understand what "taking a player's winrate to be an estimate of the probability of a win by that player, given perfect play from the given position" means, if not that "perfect play" is the proposed benchmark to compare error against, which leads to the highly-non-useful 40% or 60% above.

Side note: self-play in training is very, very far from perfect play, and it is even very far from "as close to perfect play as we can come at this time", because self-play in training is very deliberately made noisy and weaker than bots are actually capable of playing (due to performance reasons, and due to noise actually being desirable in the dynamics of training).

 Post subject: Re: Find the best move (AI will be no help)
Post #29 Posted: Fri Nov 01, 2019 6:40 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
lightvector wrote:
Ummm, maybe I'm misunderstanding something. Since Go has no inherent randomness, in a given position perfect play will always win or will always lose. So say a bot says 60%. Then, as Uberdude said, if comparing to the probability of winning under perfect play, either 40% or 60% is precisely the error. It is not merely the "maximum" error; one of those two *is* the error. Aside from somehow divining optimal play to know which of those two it is, there is nothing further to research and no better estimate is possible.


What I think you are missing is that the probabilities are conditioned on our state of knowledge. Perfect play might be deterministic, but our knowledge of it is uncertain.

Here is a simple example. Suppose that we have a randomly shuffled standard deck of 52 cards. What is the probability that the 9th card in the deck is the Jack of Diamonds? Let's say that it is 1/52. Suppose now that we check the 9th card and it is the Jack of Diamonds. Does that mean that our error is 51/52? No. Given our assumptions our error is 0. That reflects our knowledge about standard decks of cards.

Now suppose that we know nothing about standard decks of cards. What is the probability for us that the 9th card in the deck is the Jack of Diamonds? All we can say is that it lies between 0 and 1. Now let us check the 9th card in randomly shuffled decks 1,000,000 times. Doing so lets us estimate the physical probability that it is the Jack of Diamonds.
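Bill's frequency check can be simulated directly. A small Monte Carlo sketch (fewer trials than the 1,000,000 in the example, and an arbitrary numeric encoding standing in for the Jack of Diamonds):

```python
import random

# Monte Carlo version of the card example: with no prior knowledge of
# decks, estimate the physical probability that the 9th card of a
# shuffled deck is the Jack of Diamonds. Cards are encoded 0..51; the
# value 11 arbitrarily stands in for the Jack of Diamonds.
random.seed(0)
deck = list(range(52))
trials = 100_000
hits = 0
for _ in range(trials):
    random.shuffle(deck)
    if deck[8] == 11:  # the 9th card, 0-indexed
        hits += 1

estimate = hits / trials
print(estimate)  # lands close to 1/52 ~ 0.0192
```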

IIUC, one definition of a winrate estimate is the probability that the player whose winrate it is will win with self play if the game is played out from that point. Let's say that 160 moves have been played and the winrate estimate for Black is 60% with 10k playouts. From our previous discussion I believe that you said that self play from that position is deterministic, that we cannot replay it and get a different result. But we can look at other positions after 160 moves with a winrate estimate for Black of 60% with 10k playouts. Doing so many times will allow us to estimate the error of that winrate estimate under those conditions.

We can also check the error of the difference in winrates between two plays under certain conditions in a similar manner. How often does the play with the higher winrate win while the other play loses, and vice versa? My hunch is that early in the game there are many plays that produce the same win/loss result with perfect play, and even with very good play, but if LZ says that the winrate difference between the two is 7% with a comparable number of playouts greater than 10k, I think that it is very likely that the play with the worse winrate estimate is a mistake. If LZ says that the winrate difference is only 2%, I am inclined to think that the difference could easily be noise.
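The calibration study described here would look something like the following sketch, with entirely hypothetical records in place of real game data: bucket positions by their winrate estimate, play them out, and compare the estimate against the observed win frequency.

```python
from collections import defaultdict

# Calibration sketch: gather many positions whose winrate estimate is
# about 60%, play them out, and compare the estimate with the observed
# win frequency. The records below are hypothetical.
records = [  # (winrate estimate, 1 if that side went on to win, else 0)
    (0.61, 1), (0.59, 1), (0.60, 0), (0.62, 1), (0.58, 0),
    (0.60, 1), (0.61, 0), (0.59, 1), (0.62, 1), (0.60, 1),
]

buckets = defaultdict(list)
for est, won in records:
    buckets[round(est, 1)].append(won)  # group estimates to one decimal

for est, results in sorted(buckets.items()):
    observed = sum(results) / len(results)
    print(f"estimated {est:.0%} -> observed {observed:.0%} "
          f"over {len(results)} positions")
```

With real data, the gap between estimated and observed frequency in each bucket is exactly the kind of empirical error Bill is asking about.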

Edit: If you are interested in a Bayesian approach, I can recommend The Estimation of Probabilities by I. J. Good. :)


 Post subject: Re: Find the best move (AI will be no help)
Post #30 Posted: Fri Nov 01, 2019 8:44 pm 
Lives in sente

Posts: 759
Liked others: 114
Was liked: 916
Rank: maybe 2d
Bill - ah, that makes sense, I see what you're getting at now. Thanks!

Yes, I'm already familiar with Bayesian reasoning, but your phrasing in your earlier posts did not suggest to me that this is what you meant. Partly because when I discuss things from a Bayesian perspective with others in actual work or in real life (which is actually not infrequent), I'm very used to everyone speaking in a way that reflects how Bayesian probabilities are model-dependent and observer-dependent - that they are foremost a quantification of beliefs and of uncertainty rather than of objective probabilities in the world.

E.g. a phrase like "the error in the winrate as an estimate of the probability of a win by a perfect player" linguistically suggests that there is a correct probability out there in reality, and that we are comparing the winrate against that. In objective reality, the only relevant probabilities for a perfect player are 0% or 100%, so the error (i.e. the difference) between a winrate of, say, 60% and this must be either a full 60% or 40%, though we do not know which. And of course, this is not useful. The object being reasoned about is itself not probabilistic; the only uncertainty is in our knowledge about it. And it is the latter uncertainty that we want to discuss and compare against, but this is not implied by the plain English meaning of that wording.

By contrast, a phrasing like "the error in the winrate relative to our best belief/credence that perfect play would win" or even just "relative to our probability for perfect play winning (given our uncertainty)" linguistically implies what we are comparing the winrate against is something to do with us and our knowledge rather than something objective.

Paying attention to this sort of word/syntax choice takes effort, but I find it helps a lot in clarity when I discuss things like this.

 Post subject: Re: Find the best move (AI will be no help)
Post #31 Posted: Fri Nov 01, 2019 9:11 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
Ah, thanks, lightvector. :)

As for terminology, I was a Bayesian before the Bayesian revival, and I was not involved in it. I am also used to talking to frequentists. And I am not a subjectivist. Hence my saying that probability is conditioned on our knowledge. I am closest to Jaynes and Pearl, I guess.

In this discussion, with a number of people, it seemed to me that people had trouble with the idea of estimating probability. So I had no idea that anyone was taking a Bayesian perspective. :)


 Post subject: Re: Find the best move (AI will be no help)
Post #32 Posted: Sat Nov 02, 2019 2:16 am 
Lives with ko

Posts: 131
Liked others: 1
Was liked: 18
Rank: AGA 4 dan
Universal go server handle: telegraphgo
Frankly, I'm entirely unconvinced that the AI "winrates" and probability have anything to do with one another whatsoever. We have a lot of fuzzy talk about it in this thread, and since I'm confused I'd like to descend to the nitty-gritty. If it's not too dry, I'd like to try to define rigorously, and without probability, what Bill is talking about, so that he or anyone else can show me where precisely we differ in opinion (if we do). Feel free to skip this post if you hate math :D

I think it's fair to model every AI as an effectively reproducible function f: X -> Y, where X is the set of all legal board states [note that this is finite], and Y is vaguely similar to [0,1] (let's call 0 opponent win and 1 AI win for discussion). If f(x) > 0.5, then the AI believes that with perfect play it wins, if f(x) < 0.5 then the AI believes its opponent wins. Beyond that is some sort of metric of the AI's confidence, which seems to generally give more extreme values when we humans have confidence as well. We'd like to be able to take two different confidence evaluations and call the magnitude of the difference significant, so that choosing to play one variation is certainly a mistake over choosing the other.

Define g(x): X -> X, g(x) = x U k, where k is the move that has the best f(x U k) for any legal move k. This function just plays the game that the AI says is best.

Then define h(x): X -> X as perfect play - I'm going to assume perfect understanding of the AI, which could be possible if you had instant access to f(x) for all x. h(x) = x U j, where j is a legal move [not necessarily noticed by the AI] such that the worst board state x' that the AI would play towards, by its own assessment of f(x'), is reachable via playing the move j. Since I trust the AI to always count points correctly after the endgame, and any working trap must be sprung eventually, there is no other mysterious better sequence to play to further beat the AI. Therefore, for any position x, the alternation g(h(g(h(...(x)...)))) produces the game that leads the AI, at some point, to encounter the position it views most dimly, but would play into repeatedly from the position x. Obviously, if the AI is losing, then this state will be 0, but if the AI is winning, it may only ever drop to 0.6, 0.8, etc. before eventually climbing up to 1.
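A drastically simplified toy version of g and h on a hand-built two-ply tree may make the intent concrete. Here h is reduced to minimising the AI's best-case next evaluation (weaker than the full definition above), and all states and values are invented:

```python
# Toy illustration of g (the AI's move) and h (an adversary exploiting
# the AI's own evaluation) on a hand-built 2-ply tree. All states, moves
# and evaluations f are invented for illustration.

moves = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
f = {"root": 0.60, "a": 0.65, "b": 0.55,
     "a1": 0.30, "a2": 0.70, "b1": 0.58, "b2": 0.52}

def g(x):
    """AI's choice: the successor it evaluates most highly."""
    return max(moves[x], key=lambda k: f[k])

def h(x):
    """Simplified adversary: the successor whose best continuation for
    the AI (by the AI's own f) is worst."""
    def ai_best_case(y):
        return f[y] if y not in moves else max(f[k] for k in moves[y])
    return min(moves[x], key=ai_best_case)

print(g("root"))  # the AI prefers "a" (f = 0.65)
print(h("root"))  # the adversary steers to "b", capping the AI at 0.58
```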

Consider the function w(x): X -> X to output the exact board state where f(w(x)) is the minimal assessment reachable by the compound g(h(g(h(...(x)...)))). If we're only interested in the next "stage" of the game, then we could limit ourselves to some finite number of iterations of g(h(...)) and redefine w accordingly. We could even limit ourselves to exactly the depth of the AI's search tree. I'm not sure exactly what Bill wants here.

Our goal is to find some number M > 0 that we'll call a "margin of error", independent of board state, such that for any two positions x, y in X with f(x) > 0.5 and f(y) > 0.5, [f(x) - f(y) > M] => [f(w(x)) >= f(w(y))].

The intent of this statement is that the margin of error will tell apart the 'comfortability' of the winning moves. The restriction to 'winning' positions is, I think, sensible, as knowing how to be less likely to lose is opponent-based, as opposed to knowing how to be more likely to win. A difference above the margin of error will tell us that, against a 'perfect' opponent, either the AI can't handle either position, or the better position is winning and the other is not, or the position with the higher value can go less wrong when well handled.

It's likely that this value M isn't constant, and actually depends on the value of f(x) and f(y). Bill appears to suggest that in the range [0.5, 0.75] a reasonable value for M should be around 0.04 for at least one particular AI that he studied in depth.

I'm not quite so sure about that. The very position this thread started with showed a position x for which f(x) and f(h(x)) differed by more than 0.04 already. It seems likely that a game of the AI against our elusive perfect player h(x) would result in many similar traps being laid for the AI, most outside of its own evaluation. Perhaps what we'd really like to do is consider a strategic game, where the AI isn't ever surprised by h(x), but perhaps a little outplayed.

To do this, we can consider an augmented f'(x) such that for any position where the AI misses a tesuji that h(x) finds [that is: f(w(g(x))) < f(w(h(x)))], the AI is forced to analyze the tesuji until it reevaluates (so that f(h(x)) > f(g(x)), redefining g(x) for that x). Then the normal search process is extrapolated to all possible preceding board states, so that the AI no longer misses any tesuji, but retains its signature fluid evaluation.

If we similarly define a margin of error off this augmented AI, M' > 0 s.t. for all x, y in X with f'(x) > 0.5 and f'(y) > 0.5, [f'(x) - f'(y) > M'] => [f'(w(x)) >= f'(w(y))], then M' = 0.04 seems pretty reasonable to me, actually. The AI is bound to make some strategic errors, but not an overwhelming amount. This seems excessively difficult to actually evaluate, though - perhaps a method such as my process for finding the move the AI missed could assist you for any position x, but that seems really arduous.

Talking about a move-by-move 'margin of error' [more precisely, max(f'(g(x)) - f'(x)) over x in X] seems quite difficult to pin down, as it's almost entirely dependent on the AI missing something. Perhaps you can do it for positions in some subset of X, lacking the critical sequences that the AI ignores? I got the impression that this is what you're interested in, but it seems very difficult to quantify.

Note that I still haven't talked about probability once, and at this point I don't think I should see a need for it. If you can show that my model is incomplete without probability, then please, show me. But as far as I understand, this talk of Bayesian logic and whatnot is confusing the concept of the AI 'winrate', which is mostly just an arbitrary number whose programmers at Google once gave bounds of 0 and 1, with actual player-vs-player winrates, which to me are entirely separate.

 Post subject: Re: Find the best move (AI will be no help)
Post #33 Posted: Sat Nov 02, 2019 5:05 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
TelegraphGo wrote:
Frankly, I'm entirely unconvinced that the AI "winrates" and probability have anything to do with one another whatsoever. We have a lot of fuzzy talk about it in this thread, and since I'm confused I'd like to descend to the nitty-gritty. If it's not too dry, I'd like to try to define rigorously, and without probability, what Bill is talking about, so that he or anyone else can show me where precisely we differ in opinion (if we do). Feel free to skip this post if you hate math :D


Just to let you know, I am not defining winrate as probability. That has already been done by others, and you can find it defined elsewhere in these discussions, which have been going on for a while, or someone in the know can define it here. :)

OC, it is possible just to regard it as a number between 0 and 1 that indicates a bot's evaluation of a position with a certain player to play, from the perspective of one player, P. If P's winrate is greater than 0.5, then the evaluation favors P, and if it is less than 0.5 it favors P's opponent. There is still the question of how good that evaluation is. In the Elf commentaries the Elf team does not report a choice of play by Elf which has fewer than 1500 playouts. Furthermore, when a human play has fewer than 500 playouts, they have it inherit its winrate from Elf's top reply to it. :)

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

 Post subject: Re: Find the best move (AI will be no help)
Post #34 Posted: Sat Nov 02, 2019 5:18 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
TelegraphGo wrote:
Bill appears to suggest that in the range [0.5, 0.75] a reasonable value for M should be around 0.04 for at least one particular AI that he studied in depth.

I'm not quite so sure about that.


Taking the margin of error for Elf as around 4% is by guess and by golly. It may be educated guesswork, but it is guesswork, faute de mieux. Nobody has done the research on how good winrates are, while commentators utilize them uncritically.

Quote:
Talking about move-by-move 'margin of error' [more precisely, max(f'(g(x)) - f'(x)) over x in X] seems quite difficult to pin down, as it's almost entirely dependent on the AI missing something. Perhaps you can do it for positions in some subset


The very idea of winrate as probability depends upon AI missing something. OC, we know that it does. ;)


 Post subject: Re: Find the best move (AI will be no help)
Post #35 Posted: Sat Nov 02, 2019 8:42 am 
Lives in gote

Posts: 502
Liked others: 1
Was liked: 153
Rank: KGS 2k
GD Posts: 100
KGS: Tryss
TelegraphGo wrote:
Note that I still haven't talked about probability once, and at this point I don't see a need for it. If you can show that my model is incomplete without probability, then please, show me. But, as far as I understand it, this talk of Bayesian logic and whatnot confuses two concepts: the AI 'winrate', which is mostly just an arbitrary number to which its programmers at Google once gave bounds of 0 and 1, and actual player-vs-player winrates, which to me are entirely separate.


The key is that your function f is an interpolation of the training data.

 Post subject: Re: Find the best move (AI will be no help)
Post #36 Posted: Sat Nov 02, 2019 1:43 pm 
Lives with ko

Posts: 131
Liked others: 1
Was liked: 18
Rank: AGA 4 dan
Universal go server handle: telegraphgo
Quote:
The key is that your function f is an interpolation of the training data.

True, f is technically nondeterministic, but I think it shouldn't severely weaken the AI to assign an arbitrary deterministic choice in positions where the AI might come up with mildly different results on different evaluations. The AI seems to me to be relatively consistent on repeated analysis (at least compared to the magnitude of the errors it typically makes). If I don't misunderstand, shouldn't this make f completely well-defined, so that this isn't really a problem?

Quote:
The very idea of winrate as probability depends upon AI missing something. OC, we know that it does. ;)

The AI certainly does miss things, and I see two 'classes' of misses: 1. analyzing deeply but slightly inaccurately, and 2. failing to analyze deeply at all. The problem with a move-by-move margin of error is that you could take a position where the AI is about to miss something of the second class. When the AI makes a class 2 oops, it can lose almost any number of percentage points, which is hard to bound. Class 1 misses, on the other hand, are much more likely to have some useful upper bound. I tried to set up my analysis so that it only accounts for class 1 misses, because it's hard to pinpoint the board states where the AI is about to make a class 2 miss. Maybe with more thought I could rigorously define this as well, though.
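One crude way to make the class 1 / class 2 split mechanical is to threshold the per-move winrate drop. The 0.10 cutoff below is an arbitrary assumption for illustration, not something anyone in this thread has proposed:

```python
def split_misses(winrate_drops, blunder_threshold=0.10):
    """Partition per-move winrate drops (each in [0, 1]) into
    class 1 (small analysis inaccuracies) and class 2 (deep reading
    failures), by a crude size threshold."""
    class1 = [d for d in winrate_drops if d < blunder_threshold]
    class2 = [d for d in winrate_drops if d >= blunder_threshold]
    return class1, class2

drops = [0.01, 0.03, 0.25, 0.02, 0.40, 0.05]
small, big = split_misses(drops)
print(len(small), len(big))  # 4 2
print(max(small))            # a bound suggested by class 1 misses only
```

A margin of error estimated from `small` alone would be the class-1-only number; the two entries in `big` are exactly the unbounded oopses that wreck the statistic if you leave them in.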

Quote:
Just to let you know, I am not defining winrate as probability.

I had a feeling you weren't, but I wasn't quite sure exactly what you were going for. I'm mostly just trying to get some kind of agreed groundwork, so we can get to the fun stuff.

Quote:
Taking the margin of error for Elf as around 4% is by guess and by golly. It may be educated guesswork, but it is guesswork, faute de mieux. Nobody has done the research on how good winrates are, while commentators utilize them uncritically.

I wouldn't expect anything more than guesswork here, since it's probably almost as hard to figure an exact number as it is to completely solve the game of Go itself. But where particularly can we apply the number? Is it one of the M that I suggested, or perhaps a different definition of M? What do you think the most useful kind of M is with regard to effective human-AI analysis?

 Post subject: Re: Find the best move (AI will be no help)
Post #37 Posted: Sat Nov 02, 2019 2:53 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
TelegraphGo wrote:
Bill Spight wrote:
Just to let you know, I am not defining winrate as probability.

I had a feeling you weren't, but I wasn't quite sure exactly what you were going for. I'm mostly just trying to get some kind of agreed groundwork, so we can get to the fun stuff.

It's not that I don't regard it as a probability, but I rely upon the definition of others. :)

Quote:
Taking the margin of error for Elf as around 4% is by guess and by golly. It may be educated guesswork, but it is guesswork, faute de mieux. Nobody has done the research on how good winrates are, while commentators utilize them uncritically.

I wouldn't expect anything more than guesswork here, since it's probably almost as hard to figure out an exact number as it is to completely solve the game of Go itself. But where particularly can we apply the number? Is it one of the M that I suggested, or perhaps a different definition of M? What do you think the most useful kind of M is with regard to effective human-AI analysis?

My main concern is with comparing candidate plays. Theoretically, I would like to see how the number of playouts affects the margins of error for winrate estimates.
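For intuition only: if playouts were independent win/loss samples, which guided MCTS playouts are not, the margin of error would shrink like 1/sqrt(n). The numbers below are that idealized binomial standard error, not Elf's actual error:

```python
import math

def winrate_standard_error(p, n_playouts):
    """Standard error of a winrate estimated from n independent
    win/loss samples with true win probability p.  Real playouts are
    guided and correlated, so treat this as a lower-bound style
    intuition, not an engine's actual margin of error."""
    return math.sqrt(p * (1 - p) / n_playouts)

for n in (100, 500, 1500, 6000):
    print(n, round(winrate_standard_error(0.5, n), 3))
# 100 0.05
# 500 0.022
# 1500 0.013
# 6000 0.006
```

Under this (overly optimistic) model, quadrupling the playouts only halves the error, which is one reason empirical study of real winrate errors would be so welcome.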


 Post subject: Re: Find the best move (AI will be no help)
Post #38 Posted: Sat Nov 02, 2019 2:57 pm 
Lives in gote

Posts: 502
Liked others: 1
Was liked: 153
Rank: KGS 2k
GD Posts: 100
KGS: Tryss
TelegraphGo wrote:
Quote:
The key is that your function f is an interpolation of the training data.

True, f is technically nondeterministic, but I think it shouldn't severely weaken the AI to assign an arbitrary deterministic choice to positions where the AI might come up with mildly different results on different evaluations. The AI seems to me to be relatively consistent on repeated analysis (at least compared to the magnitude of the errors it typically makes). If I don't misunderstand, shouldn't this make the f completely well-defined so that this isn't really a problem?


f:X->[0,1] is totally deterministic. That's the "0 playout" evaluation.

What is non-deterministic is the winrate calculated from more than one playout: it is computed from values f(x'), where the x' are positions further down the tree, and the randomness comes from which positions x' you use. It is still based on the interpolation of your training data, but instead of applying it only to the current position, you apply it to possible future positions.

Lightvector could tell us what KataGo winrates are an interpolation of, and how they are computed (only the node at the end of the best branch? Something else?)
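A toy version of that distinction, with made-up value-head numbers standing in for the real network and a uniformly random tree walk standing in for real, guided search:

```python
import random

# Made-up value-head outputs for three successor positions.  The real
# f : X -> [0, 1] is the trained network's deterministic evaluation.
F = {"pos_a": 0.52, "pos_b": 0.48, "pos_c": 0.61}

def f(position):
    """The '0 playout' evaluation: deterministic, same input -> same output."""
    return F[position]

def search_winrate(children, n_playouts, seed=None):
    """Toy many-playout winrate: average f over randomly chosen
    successor positions x'.  The only randomness is WHICH x' are
    visited, exactly as described above."""
    rng = random.Random(seed)
    return sum(f(rng.choice(children)) for _ in range(n_playouts)) / n_playouts

children = list(F)
print(f("pos_a") == f("pos_a"))                   # True: 0 playouts is deterministic
r1 = search_winrate(children, 200, seed=1)
print(min(F.values()) <= r1 <= max(F.values()))   # True: an average of f values
print(search_winrate(children, 200, seed=1) == r1)  # True: same seed, same sampled x'
```

Running with a different seed generally gives a slightly different average, which is the run-to-run wobble you see when re-analyzing the same position.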

 Post subject: Re: Find the best move (AI will be no help)
Post #39 Posted: Sun Nov 03, 2019 7:11 am 
Lives in gote

Posts: 445
Liked others: 0
Was liked: 37
Tryss wrote:
f:X->[0,1] is totally deterministic. That's the "0 playout" evaluation.

This depends on the random choice of "which rotation the position was fed with" (which is actually the most significant random factor for search too).
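In other words, even the "0 playout" number is stochastic because the position is fed to the net in one of its 8 dihedral symmetries. A sketch with a fake, deliberately orientation-sensitive evaluation (a real net is only approximately symmetric; this toy one exaggerates that):

```python
import random

def rotations(board):
    """All 8 dihedral symmetries of a square board (list of rows)."""
    syms, b = [], [list(row) for row in board]
    for _ in range(4):
        b = [list(r) for r in zip(*b[::-1])]   # rotate 90 degrees
        syms.append(b)
        syms.append([row[::-1] for row in b])  # plus its mirror image
    return syms

def net_eval(board):
    """Stand-in value head whose output depends on orientation,
    mimicking a net that is not perfectly symmetric."""
    flat = [c for row in board for c in row]
    return sum(i * c for i, c in enumerate(flat)) % 100 / 100.0

def eval_random_rotation(board, rng):
    """The '0 playout' evaluation as actually performed: the position
    is fed in a randomly chosen orientation."""
    return net_eval(rng.choice(rotations(board)))

board = [[0, 1], [2, 3]]  # a deliberately asymmetric toy "position"
vals = {net_eval(s) for s in rotations(board)}
print(len(vals) > 1)  # True: the fake net scores orientations differently
```

Averaging `net_eval` over all 8 symmetries instead of sampling one would make the evaluation deterministic again, at 8x the cost.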
