AlphaZero paper published in journal Science

Bill Spight · Post by **Bill Spight** » Sat Dec 08, 2018 8:54 am

dfan wrote:..."I am willing for my chance of winning to go down from 98% to 97% in return for winning by 10.5 points instead of 0.5". Due to the nature of the playing system, there's no good way to say "I have a 100% chance of winning, and now I want to maximize my score while retaining that 100% chance", although of course that statement is logically meaningful.

ez4u wrote:
Bill Spight wrote:
ez4u wrote: The statements may be logically meaningful but they are trivial. Isn't the real challenge to make sense of a statement like, "I have a 51% chance of winning by 0.5 points by playing X and a 49% chance of winning by 1.5 points by playing Y. I want to maximize my score; which should I choose?"
The thing is, amateur dans play the late endgame almost perfectly; but even pros do not play the late endgame perfectly. Under those circumstances, if it's a close call in the late endgame between going for a ½ pt. win versus going for a 1½ pt. win, the extra point gives a margin of safety. At least for humans.

But most, if not all, modern top bots do not assume nearly perfect play when they calculate winrates. And they do not estimate the margin of safety by expected scores, but by percentages. As far as I can tell, the endgame, particularly the late endgame, is one of the places where humans play better than bots; life and death, semeai, and ladders being others. In all of these places, local reading can give the right global results. Bots excel at global reading, humans still excel at local reading.
If the discussion is about switching from a winrate strategy to a maximum point strategy, then the starting point is the fuseki not the late endgame.

To take your example, in general, in the opening the difference between estimated winrates of 51% and 49% is more indicative of the chances of winning than the difference between estimated results of 1½ pts. and ½ pt. But in the late endgame I think that the difference between estimated results (by current human pros) of 1½ pts. and ½ pt. is more indicative of the chances of winning than the difference between estimated winrates (by current top bots) of 51% and 49%. The reason lies in the reduction of the uncertainty in estimated point scores as the game goes on. Currently the uncertainty of estimated point scores is so great in the opening that no pros that I know of even attempt to estimate them. (The traditional approach is to estimate locally secure territory and to use that as one factor to consider.)

seberle · Post by **seberle** » Sat Dec 08, 2018 1:19 pm

I just finished reading the AlphaZero paper, which was fascinating. I have a couple of questions, if anyone happens to know more.

On page 2, they explain that each move is selected "either proportionally (for exploration) or greedily (for exploitation) with respect to the visit counts at the root state." What does that mean in layman's terms?

It's interesting that they abandoned symmetry because chess and shogi don't have symmetric boards. I wonder if AlphaZero has any idiosyncrasies, such as preferring a certain joseki in one corner, but a different variation in another corner. Did anyone read the supplemental material? Do they mention anything like this?

dfan · Post by **dfan** » Sat Dec 08, 2018 3:55 pm

seberle wrote: On page 2, they explain that each move is selected "either proportionally (for exploration) or greedily (for exploitation) with respect to the visit counts at the root state." What does that mean in layman's terms?

Say that when deciding on its next move, it has considered 500 variations starting with move A, 300 variations starting with move B, and 200 starting with move C. (In general, it tries to look more at moves that look more promising, for obvious reasons.)

In the proportional case (this is "temperature = 1", if you see it elsewhere), it would pick move A with 50% probability, move B with 30% probability, and move C with 20% probability, proportionally to their visit counts. This emphasizes exploration, and is done early in self-play games to generate a varied data set and make sure it tries lots of ideas and doesn't get stuck in its learning.

In the greedy case (this is "temperature = 0"), it would pick move A all of the time. This emphasizes exploitation, and is what you do in competition when you want to play your best.
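A minimal sketch of the two selection modes described above, using the same A/B/C visit counts. The function name `select_move` and the dictionary layout are illustrative assumptions, not taken from the paper; intermediate temperatures are handled here by raising the counts to the power 1/temperature, which is a common convention rather than something stated in this thread.

```python
import random

def select_move(visit_counts, temperature=1.0, rng=random):
    """Pick a move from root visit counts (hypothetical helper).

    temperature == 0 -> greedy: always the most-visited move
    temperature == 1 -> proportional: probability = visits / total visits
    """
    moves = list(visit_counts)
    if temperature == 0:
        return max(moves, key=lambda m: visit_counts[m])
    # Assumption: other temperatures sharpen or flatten the distribution
    # by exponentiating the counts before sampling.
    weights = [visit_counts[m] ** (1.0 / temperature) for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]

visits = {"A": 500, "B": 300, "C": 200}
print(select_move(visits, temperature=0))    # greedy: always "A"
print(select_move(visits, temperature=1.0))  # "A", "B" or "C" at 50/30/20
```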

seberle · Post by **seberle** » Sat Dec 08, 2018 10:33 pm

Uberdude wrote:I wouldn't try to convert those Elo differences to handicap, it's like converting apples to volts. To take the example of LeelaZero vs Haylee a while ago (a bit weaker than Fan Hui I suppose), it absolutely demolished her on even and 2 stones, in a manner that if a human (e.g. Lee Sedol) did that I'd expect her to lose on 3 stones too, but she won easily on 3 with LZ going silly.

Two questions for anybody:

So what do people say is the proper handicap between top pros and perfect play? I remember before AlphaGo I read that some pros thought that the top players would need no more than a 4 stone handicap against "God". Is that still what some think?

If Elo can't be converted to handicap at these high ranks, how do you determine handicap from Elo? Or can you? At what rank does the rule "100 Elo points = 1 rank" begin to break down?

moha · Post by **moha** » Sun Dec 09, 2018 4:42 am

seberle wrote:So what do people say is the proper handicap between top pros and perfect play? I remember before AlphaGo I read that some pros thought that the top players would need no more than a 4 stone handicap against "God". Is that still what some think?

There may be a stone or two of uncertainty here, but it seems obvious that >3 and <9 stones are necessary. It is just hard to imagine a top pro losing at 9 stones; the board is simply not big enough and the game not long enough.

seberle wrote:If Elo can't be converted to handicap at these high ranks, how do you determine handicap from Elo? Or can you? At what rank does the rule "100 Elo points = 1 rank" begin to break down?

You may look at W's avg winrate at each strength level to get an idea. Since we can guess that fair komi is 7, the significance of the half point (slightly more with imperfect play) advantage also hints at the significance of one handicap stone at that level. It is worth a few % at amateur levels, a few % more at top pro levels, even more at top bot levels, and 100% at perfect level.

lightvector · Post by **lightvector** » Sun Dec 09, 2018 3:36 pm

seberle wrote: If Elo can't be converted to handicap at these high ranks, how do you determine handicap from Elo? Or can you? At what rank does the rule "100 Elo points = 1 rank" begin to break down?

If you're equating ranks with stones, I'd say it breaks down all over the place, since 100 Elo points = 1 rank is not so great a rule of thumb to begin with. You might be confused by the fact that certain rating models used by various Go organizations or servers alter the very definition of what an "Elo point" is, so that under those systems 100 "Elo points" = 1 rank by definition. But of course those altered "Elo points" have little to do with the traditional Elo points that presumably you're asking about, i.e. the ones that underlie FIDE chess ratings, goratings.org, BayesElo, WHR, and the ones that academic publications will usually use when reporting ratings differences. With traditional Elo points, a fixed Elo difference corresponds to a fixed modeled winning chance rather than a fixed rank difference, scaled so that 400 Elo ~= 10:1 winning odds.
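For reference, the scaling just described (400 Elo ~= 10:1 odds) corresponds to the standard logistic Elo curve. The sketch below assumes that standard formula; the function name `expected_score` is only illustrative.

```python
def expected_score(elo_diff):
    """Modeled winning chance for a player rated `elo_diff` points above
    the opponent, under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# A 400-point favourite wins 10 games for every 1 lost, i.e. 10/11 of games.
print(round(expected_score(400), 3))   # 0.909
# A player 100 points weaker is modeled to win about 36% of games.
print(round(expected_score(-100), 2))  # 0.36
```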

The correspondence between traditional Elo differences (i.e. winning chance) and rank difference is not simply a fixed ratio and it becomes highly nonlinear once you get into even amateur dan ranks, much less pro level or beyond. If you're interested in some actual data, I know there are some studies on OGS and/or KGS out there that have been done, or if you like, here's some old real data from EGF tournaments: http://gemma.ujf.cas.cz/~cieply/GO/statev.html

That data is just among humans of course. If you want to add bots into the mix, any computer chess programmer will tell you that Elo differences between bots (particularly ones measured with self-play) don't necessarily translate into the same Elo differences against humans, and the same appears to be true for Go. And in Go it appears that without clever tricks like "dynamic komi" (or even with such tricks), strong Go bots also scale quite differently than humans in handicap games versus even games.

Hope that helps. Basically rating and rankings are a pretty complex mess and you can't really boil it down into any simple rule.

seberle · Post by **seberle** » Sun Dec 09, 2018 9:47 pm

lightvector wrote: 100 Elo points = 1 rank is not so great a rule of thumb to begin with.

Thanks, that was very helpful.

Ok, see if I'm understanding better. The EGF rating system, for example, has modified the Elo system so as to force 100 rating points to be equivalent to one rank. If the table here (https://senseis.xmp.net/?EGFRatingSystem) is any indication, it looks like the EGF system wanders far from the Elo win rate of about 36% for one rank difference in the SDK ranks, but is reasonably close for DDK. Am I interpreting this correctly?

jlt · Post by **jlt** » Mon Dec 10, 2018 3:07 am

My understanding is the same, but I think that winning percentages are calculated from a theoretical formula, and not really observed. To get observed winning percentages, go to the website http://www.europeangodatabase.eu/EGD/winning_stats.php

Between 2003 and 2018, we get the table

                    Winning Statistics - Even Games
				  
           G + 1               G + 2               G + 3                G + 4       

 G     Nw    Ng    Pw      Nw    Ng    Pw      Nw    Ng    Pw      Nw    Ng    Pw  
--- ------------------- ------------------- ------------------- -------------------  
20k   2787  9258   30.1   1737  6788   25.6    689  4170   16.5   370  3099   11.9
19k   1304  3515   37.1    617  2238   27.6    213  1170   18.2   140   879   15.9
18k   1640  4016   40.8    898  2725   33.0    475  1738   27.3   138   816   16.9
17k   1690  4020   42.0   1209  3202   37.8    369  1384   26.7   151   915   16.5
16k   2030  4774   42.5    957  2811   34.0    366  1457   25.1   188  1005   18.7
15k   2055  4883   42.1   1184  3365   35.2    576  1861   31.0   213  1104   19.3
14k   1791  4135   43.3   1089  2943   37.0    406  1342   30.3   228   931   24.5
13k   2127  4798   44.3   1098  2814   39.0    538  1671   32.2   220   949   23.2
12k   2107  4723   44.6   1671  4018   41.6    510  1638   31.1   203   993   20.4
11k   2500  5554   45.0   1285  3282   39.2    519  1554   33.4   193   819   23.6
10k   3125  7159   43.7   1926  4899   39.3    596  1841   32.4   280  1162   24.1
 9k   3460  7888   43.9   1798  4535   39.6    585  1873   31.2   251  1165   21.5
 8k   3809  8743   43.6   2070  5406   38.3    631  2143   29.4   217  1100   19.7
 7k   4052  9243   43.8   2211  5554   39.8    592  1928   30.7   211  1069   19.7
 6k   4897 11087   44.2   2136  5673   37.7    637  2139   29.8   217  1193   18.2
 5k   5025 11599   43.3   2348  6306   37.2    665  2389   27.8   272  1527   17.8
 4k   4856 11130   43.6   2154  6101   35.3    703  2606   27.0   288  1756   16.4
 3k   4777 11253   42.5   2209  6818   32.4    657  2951   22.3   173  1503   11.5
 2k   5158 12825   40.2   2440  7389   33.0    547  2841   19.3   175  1721   10.2
 1k   6483 15979   40.6   2107  7541   27.9    657  4043   16.3   154  2107    7.3
 1d   5346 14674   36.4   2263  9219   24.5    580  4721   12.3   144  2275    6.3
 2d   4715 12578   37.5   1613  7282   22.2    437  3617   12.1    47  1314    3.6
 3d   3690 10912   33.8   1392  6860   20.3    200  2844    7.0    12   577    2.1
 4d   2921  8855   33.0    666  4641   14.4     35   985    3.6                   
 5d   1723  6071   28.4    174  1704   10.2                                       
 6d    477  2193   21.8

where the grades are the "declared grades". Assuming that declared grades reflect real strength accurately, we can see that

At the 15k rank, 100 EGF points = 57 real Elo
At the 5-10k ranks, 100 EGF points = 50 real Elo
At the 2k rank, 100 EGF points = 72 real Elo
At the 1d rank, 100 EGF points = 102 real Elo
At the 3d rank, 100 EGF points = 117 real Elo
At the 6d rank, 100 EGF points = 220 real Elo
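One way to arrive at figures of this kind is to invert the standard Elo curve on the observed win percentage against a player one grade stronger (the G + 1 column). This is only a sketch: the post does not say exactly how the numbers above were computed, and the function name `elo_gap_from_winrate` is hypothetical. Applied to single rows it reproduces the 3d and 6d figures closely and lands within a few points of the others, so some averaging or smoothing may have been used for those.

```python
import math

def elo_gap_from_winrate(p):
    """Elo difference implied by the weaker player's observed win
    fraction p (0 < p < 1), assuming the standard logistic Elo curve."""
    return 400.0 * math.log10((1.0 - p) / p)

# Pw values from the G + 1 column of the table, as fractions.
for grade, pw in [("15k", 0.421), ("2k", 0.402), ("1d", 0.364),
                  ("3d", 0.338), ("6d", 0.218)]:
    print(grade, round(elo_gap_from_winrate(pw)))
```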

It seems however difficult to convert Elo points into handicap stones. We can read on the same website

             Statistics of Handicap Games - strong side dan (wins for weak side)
			 
 Gr.         H 1              H 2              H 3              H 4              H 5              H 6              H 7              H 8              H 9     
Diff  Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  % 
---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---
  1    382   919  42                       1     1 100                                                                                                        
  2    214   731  29    292   783  37                                                                                                                         
  3     91   477  19    146   482  30    255   631  40                                                                                                        
  4     16   142  11     68   324  21    119   376  32    209   524  40                                                                                       
  5      6    70   9     13    82  16     66   252  26     93   297  31    195   519  38                                                                      
  6      3    31  10      4    28  14     10    66  15     62   223  28     67   200  34    168   388  43                                                     
  7      0    11   0      0    12   0      4    31  13     13    68  19     46   172  27     43   150  29    111   282  39                       1     1 100  
  8      0     2   0      1     3  33      3    12  25      3    25  12      5    42  12     43   143  30     55   136  40    114   278  41                   
  9      0     1   0      0     1   0      1    10  10      2    16  13      1    21   5     10    43  23     28   110  25     41    95  43    117   262  45

The table says for instance that a 3d wins 19% of his H1 games against a 4d, which is very strange since he wins 33.8% of his even games against a 4d. So maybe there is a bias (people choose to play handicap games when they think that their real strength difference is larger than their official rank difference), so it's not easy to determine how many stones represent a difference of 1 EGF rank.

seberle · Post by **seberle** » Mon Dec 10, 2018 4:48 am

jlt wrote: It seems however difficult to convert Elo points into handicap stones. We can read on the same website

{snip}

The table says for instance that a 3d wins 19% of his H1 games against a 4d, which is very strange since he wins 33.8% of his even games against a 4d. So maybe there is a bias (people choose to play handicap games when they think that their real strength difference is larger than their official rank difference), so it's not easy to determine how many stones represent a difference of 1 EGF rank.

Are you sure that "The table says for instance that a 3d wins 19% of his H1 games against a 4d"? I'm new to this, but I thought the table was saying the weaker player (any rank) wins 19% of their games against a player 3 ranks stronger when given a handicap of one stone.

I'm not surprised that handicap stones don't even things out smoothly since the first "handicap stone" is just komi, which is only half the value of the first move. Two handicap stones are actually only worth 1 1/2 ranks, and so forth. Or at least, that is what I have understood. Correct me if I've gotten this wrong!

jlt · Post by **jlt** » Mon Dec 10, 2018 5:06 am

Yes, I misread the table and you are right.

Anyway the statistics of handicap games are not precise enough, so I don't have enough data to determine how many stones is worth one EGF rank difference.

Bill Spight · Post by **Bill Spight** » Mon Dec 10, 2018 9:46 am

seberle wrote:
jlt wrote: It seems however difficult to convert Elo points into handicap stones.
{snip}
The table says for instance that a 3d wins 19% of his H1 games against a 4d, which is very strange since he wins 33.8% of his even games against a 4d. So maybe there is a bias (people choose to play handicap games when they think that their real strength difference is larger than their official rank difference), so it's not easy to determine how many stones represent a difference of 1 EGF rank.
I'm not surprised that handicap stones don't even things out smoothly since the first "handicap stone" is just komi, which is only half the value of the first move. Two handicap stones are actually only worth 1 1/2 ranks, and so forth. Or at least, that is what I have understood. Correct me if I've gotten this wrong!

Traditionally, rank differences were determined by handicap differences. In theory, one stone difference was equivalent to one rank difference. But handicap differences (at least for amateurs) gave an advantage to White, an advantage equivalent to komi (i.e., ½ stone). So a player two ranks stronger gave only a two stone handicap, with no komi, instead of giving three stones with Black giving komi or giving two stones with White giving komi.

Modern tournament ranks and online ranks are based upon even games, and do not necessarily tell us the proper handicap.

seberle · Post by **seberle** » Wed Dec 12, 2018 12:38 am

Bill Spight wrote: Traditionally, rank differences were determined by handicap differences. In theory, one stone difference was equivalent to one rank difference. But handicap differences (at least for amateurs) gave an advantage to White, an advantage equivalent to komi (i.e., ½ stone). So a player two ranks stronger gave only a two stone handicap, with no komi, instead of giving three stones with Black giving komi or giving two stones with White giving komi.

Modern tournament ranks and online ranks are based upon even games, and do not necessarily tell us the proper handicap.

This is interesting. First of all, how were handicap differences handled "traditionally" (do you mean before komi?). If we don't change komi, then what is the difference between a one-stone handicap and simply letting black go first? Or was going first considered being one rank stronger traditionally?

Secondly, does either system work out precisely (without doing fine adjustments to komi)? I mean, if a two-stone handicap (any system) means a 7k can play an even game against a 5k and a 5k can play an even game against a 3k, does it necessarily mean that a four-stone handicap for the 7k will get an even game against the 3k? I suppose this question is even more important for the one-stone, two-stone question: if one stone means one rank, does two stones really mean two ranks? I know I saw a debate on Sensei's Library about this once, but I didn't understand it very well and I don't remember exactly where I saw it.

John Fairbairn · Post by **John Fairbairn** » Wed Dec 12, 2018 4:01 am

seberle wrote: This is interesting. First of all, how were handicap differences handled "traditionally" (do you mean before komi?). If we don't change komi, then what is the difference between a one-stone handicap and simply letting black go first? Or was going first considered being one rank stronger traditionally?

To get your head round this you need to understand that the ranks we now use are a relatively modern construct. Traditionally (Edo times) ranks were limited to pro-level dans. They didn't even use komi. Honinbo Shuho introduced some lower grades for amateurs but his system was soon abandoned (for political reasons) and players reverted to the old dan-only/pro-only system, essentially until the democratisation after World War II. The amateurs started using their own dan scale then, and the first amateur 6-dan was Hirata Hironori in 1955 (for winning the 1st Amateur Honinbo - the prize nowadays is 8-dan).

Since then amateurs in Japan were able to use kyus - and did, but the lower ranks have been used with much more gusto in the west. Indeed, a number-only system was introduced by amateurs in Germany even before the war, and was either used or copied by other western amateur associations. We have seen western amateurs - very many with a mathematical background like those early Germans - obsessively try to apply rules and numbers to many aspects of go.

But handicaps existed well before ranking systems and so it follows they can have no real correlation. They were used to a very limited extent between pros but mostly were (and still are, in Japan) nothing more than a teaching tool. No doubt for that pedagogic reason, too, the stone placings were fixed - the idea of free handicap placement is another modern idea, inspired first by mathematical amateurs in Japan (and even giving rise to a book on them by a pro!). The use of these for rankings, and of komi, likewise has no tradition (or even theory) behind it.

The use of komi (mainly in trying to determine what an even game means - and that's varied a lot in the last 100 years) is likewise mainly a Japanese amateur idea, from 1751. Pros tried it a few times from the early 19th century, starting at 5 points and gradually reducing over the decades until it even reached 2 points. It only started to rise after World War II.

So you can see that trying to tie ranks to handicaps is like climbing up a greasy pole. Historical grades and handicaps differed from modern ones for both pros and amateurs; modern grades and handicaps differ between amateurs and pros (and by country). Komi has been messed up for 300 years. The philosophical drive behind rankings differs in the west and the Far East. People have different ideas on how to implement handicaps. Etc, etc.

Life is too short to worry about such things. Of course we all would like a way of quantifying how much stronger A is than B, but it seems sensible to accept it's always going to be a wild guesstimate - at least until we get a DeepRatingsZeroPixie algorithm.

jlt · Post by **jlt** » Wed Dec 12, 2018 4:34 am

Statistics on handicap games on the EGD website lack precision but seem to show that handicaps are approximately additive (if a rank A is 2 stones stronger than B, which is 3 stones stronger than C, then A is 5 stones stronger than C). However some players are particularly good at playing with or against handicap, and vice-versa, so the proper handicap between two given people cannot be predicted by subtracting their ranks. Bots are an extreme example (very bad at handicap go).

moha · Post by **moha** » Wed Dec 12, 2018 7:06 am

It is true that using the number of (extra) black stones as handicap - without black giving komi - is mathematically incorrect. It also makes it harder to maintain a good "feeling" about who is ahead (since in some games W gets komi, which must be factored into the human intuition, but in other games he gets none - the same problem as for a value net). So the same board position can be either roughly even or significantly better for W, depending on this.

But there are problems even without the doubtful human systems. If player A wins 50% against player B even after passing his first move, he is clearly one stone stronger. His winrate against B in even games cannot be correctly guessed from this, as that also depends on how close they are to perfect play.

IMO there are two ways to define one rank class: a certain (like 70%) winrate in even games (Elo-like), OR 50% winrate in N-stone games. For the latter approach, the scale ends with perfect play being a few stones above top pro level. For the former, the scale continues to amplify smaller and smaller strength differences and has many more steps after top pros (but is probably still bounded somewhere).
