AlphaGo selfplay

aeb · Post by **aeb** » Sat May 27, 2017 5:37 am

DeepMind announced that they would publish 50 more selfplay games by AlphaGo, and published 10 today.
They can be found on http://homepages.cwi.nl/~aeb/go/games/games/AlphaGo/ .

Mef · Post by **Mef** » Sat May 27, 2017 8:59 am

White wins 8 of 10, I wonder if that suggests that komi should be a little lower. Though I suppose that it could also be an idiosyncrasy in how AlphaGo plays.
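
A rough check on how much weight 8 of 10 can carry, as a sketch only: it assumes the ten games are independent and that a perfectly fair komi would give each side a 50% chance. The helper name `tail_at_least` is made up for this illustration.

```python
from math import comb


def tail_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


white_wins, games = 8, 10
# Chance of White winning at least 8 of 10 if each game were a fair coin flip.
one_sided = tail_at_least(white_wins, games)
print(f"P(White wins >= {white_wins} of {games} | 50:50) = {one_sided:.4f}")
```

Under those assumptions the tail is 56/1024, a bit over 5%, so ten games on their own are only weak evidence either way.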

alphaville · Post by **alphaville** » Sat May 27, 2017 3:51 pm

Mef wrote:White wins 8 of 10, I wonder if that suggests that komi should be a little lower. Though I suppose that it could also be an idiosyncrasy in how AlphaGo plays.

If changing it by one point, it may just flip similarly in black's favour. Deepmind said that 7.5 with Chinese rule is as good as it gets.

Kirby · Post by **Kirby** » Sat May 27, 2017 8:54 pm

alphaville wrote:Deepmind said that 7.5 with Chinese rule is as good as it gets.

It's not clear to me in my head that Deepmind would really know the appropriate komi (for certain). Changing the komi could have significant impact on game winning probabilities from a given board position, and could result in an altered strategy. So all of the training that's been happening through self-play might make the most sense having 7.5 komi.

It could very well be that 7.5 komi is correct, but I would think that this should be investigated more scientifically if we really want an answer to this question.

Mef · Post by **Mef** » Sat May 27, 2017 9:18 pm

alphaville wrote:
Mef wrote:White wins 8 of 10, I wonder if that suggests that komi should be a little lower. Though I suppose that it could also be an idiosyncrasy in how AlphaGo plays.
If changing it by one point, it may just flip similarly in black's favour. Deepmind said that 7.5 with Chinese rule is as good as it gets.

This is more or less what I was referring to about idiosyncrasies of AlphaGo. For top pros, going between 5.5 and 7.5 might mean that the average winning percentages shift slightly toward white (and we aim to get it as close to 50% as we can). For AlphaGo (which would be extremely consistent in play), it might be that anything other than a 100% winrate suggests there is reasonable parity.

Bill Spight · Post by **Bill Spight** » Sat May 27, 2017 9:21 pm

Kirby wrote:
alphaville wrote:Deepmind said that 7.5 with Chinese rule is as good as it gets.
It's not clear to me in my head that Deepmind would really know the appropriate komi (for certain). Changing the komi could have significant impact on game winning probabilities from a given board position, and could result in an altered strategy. So all of the training that's been happening through self-play might make the most sense having 7.5 komi.

It could very well be that 7.5 komi is correct, but I would think that this should be investigated more scientifically if we really want an answer to this question.

Oh, I am not surprised that AlphaGo vs. AlphaGo games are closest to 50:50 with a komi of 7.5 under Chinese rules. I don't think that Silver would make that claim without having tried different komis.

But I doubt that the DeepMind team tried territory rules. IIUC, even Zen does not use Japanese rules for training. But did they try Button Go (see http://senseis.xmp.net/?ButtonGo ) with a 7 pt. komi and a 1/2 pt. button? I doubt it. And since button go scores, like territory scores, normally come in 1 pt. steps instead of 2 pt. steps, they might well find a komi that yields winning odds closer to 50:50 than Chinese scoring with 7.5 komi.
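
The 2 pt. versus 1 pt. step can be illustrated with a small sketch. It assumes pure area scoring on a 19x19 board where every one of the 361 points counts for exactly one side (so no seki with shared points), and it models the button simply as half a point added to whichever side takes it. The function and variable names are made up for this example.

```python
BOARD_POINTS = 19 * 19  # 361 points on a 19x19 board


def black_wins_area(black_area: int, komi: float) -> bool:
    """Area scoring: Black wins if Black's area minus White's area exceeds komi."""
    white_area = BOARD_POINTS - black_area
    return black_area - white_area > komi


# Every possible margin (Black area minus White area) under the assumption above.
margins = sorted({2 * b - BOARD_POINTS for b in range(BOARD_POINTS + 1)})
all_odd = all(m % 2 != 0 for m in margins)

# Because margins are always odd, 5.5 and 6.5 komi decide every game the same way.
same_5_5_and_6_5 = all(
    black_wins_area(b, 5.5) == black_wins_area(b, 6.5)
    for b in range(BOARD_POINTS + 1)
)

# With a half-point button going to one side or the other, margins are 1 apart.
button_margins = sorted({m + half for m in margins for half in (-0.5, 0.5)})
button_gaps = {b - a for a, b in zip(button_margins, button_margins[1:])}

print(all_odd, same_5_5_and_6_5, button_gaps)
```

In this simplified model, moving the komi by one point changes nothing without the button, while with the button every half-point step can matter.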

luigi · Post by **luigi** » Sat May 27, 2017 9:23 pm

One of the commentators in the Ke Jie match said that, in self-play, AlphaGo won only 45% of the time with Black, which is part of the reason why Ke Jie asked to be White in the last game.

I think Chinese rules should seriously consider using 7 komi together with the button to prevent ties. The winning probability should be the same that way as it is in Japanese rules with 6.5 komi.

EDIT: Heh, of course Bill Spight beat me to it.

Kirby · Post by **Kirby** » Sat May 27, 2017 11:07 pm

Bill Spight wrote: Oh, I am not surprised that AlphaGo vs. AlphaGo games are closest to 50:50 with a komi of 7.5 under Chinese rules. I don't think that Silver would make that claim without having tried different komis.

Let's call the version of AlphaGo trained with komi of 7.5 AlphaGoX. Then I agree that it's likely that AlphaGoX vs. AlphaGoX probably has closest to 50:50 win rate using komi of 7.5 points.

But would a different type of AlphaGo that plays different moves have developed if it were trained with komi of, say, 10.5? The value network would have developed differently, probably. Let's call that hypothetical program AlphaGoY.

So an experiment where you put AlphaGoY vs. AlphaGoY might end up having games closest to 50:50 with a different komi than 7.5. Because AlphaGoY plays different types of moves than AlphaGoX... Isn't that possible?

Bill Spight · Post by **Bill Spight** » Sun May 28, 2017 6:26 am

Kirby wrote:
Bill Spight wrote: Oh, I am not surprised that AlphaGo vs. AlphaGo games are closest to 50:50 with a komi of 7.5 under Chinese rules. I don't think that Silver would make that claim without having tried different komis.
Let's call the version of AlphaGo trained with komi of 7.5 AlphaGoX. Then I agree that it's likely that AlphaGoX vs. AlphaGoX probably has closest to 50:50 win rate using komi of 7.5 points.

But would a different type of AlphaGo that plays different moves have developed if it were trained with komi of, say, 10.5? The value network would have developed differently, probably. Let's call that hypothetical program AlphaGoY.

So an experiment where you put AlphaGoY vs. AlphaGoY might end up having games closest to 50:50 with a different komi than 7.5. Because AlphaGoY plays different types of moves than AlphaGoX... Isn't that possible?

If I understand you correctly, don't we have the example of the development of komi in go history? Up until the mid-20th century, players trained on no komi games. You can see the difference in early go strategy. Black tended to play conservatively, while White played enterprisingly, to try to catch up. So [move image] was typically a kakari, and Black typically played first in three corners. According to go theory at that time, that gave a theoretical advantage to Black, but White felt the need to complicate the game. With the advent of komi we saw the rise in popularity of parallel fuseki. On the assumption that the first four moves should be in an open corner, it is easy to show that a parallel fuseki is correct (even if a diagonal fuseki is, also), because each player can guarantee a parallel fuseki. Not that the early White kakari disappeared. Even Go Seigen recommended it in certain situations in his 21st century go writings.

The 4.5 komi soon proved to be too small. It took a long time, but the Japanese finally adopted a 6.5 komi, after decades of playing with a 5.5 komi. (Even in the 1970s results with both a 4.5 komi and a 5.5 komi suggested a 6.5 komi, as an article in the AGA Journal showed.) Ing adopted a 7.5 komi by the early '80s. For some time there was a question whether even the 7.5 komi was enough. (Practical komi tends to increase with the strength of the players, up to the theoretical komi.)

How much difference does 2 points make? Apparently not much. Despite being trained to a 4.5 komi, the median results of Japanese pros tended to a 1.5 - 2.5 win for Black. With the change to 5.5 komi, that became a 0.5 - 1.5 win for Black. In the time since the rise of the parallel fuseki, has there been any strategic change in play because of changing komi? Even with the higher komi, Go Seigen felt that White should make the game difficult for Black. Maybe he was an old man living in the past, but pros still valued his insights and advice.

Now, along comes AlphaGo, the strongest go player yet. It trained on a 7.5 komi, but would its practical results in self play suggest a komi of 9.5, even as the practical results of pros with a komi of 4.5 suggested a komi of 6.5? Why not, if the theoretical komi is greater than 7.5? (Komi by Chinese rules tends to shift by 2 point increments.) No, White has the advantage in AlphaGo vs. AlphaGo games with 7.5 komi, which suggests, if anything, that a 5.5 komi might be better.

Did the DeepMind team train a version of AlphaGo on a 5.5 komi? Maybe, but I kind of doubt it. Why bother? But I feel sure that they would not make any comments about komi unless they had millions of AlphaGo self-play games with a 5.5 komi. Is AlphaGo so brittle that training on a 7.5 komi would lead to relatively poor play at a 5.5 komi? I doubt it. Human pros were not so brittle with a 4.5 komi. They could have jumped to a 6.5 komi easily in the 1970s, just as they jumped to a 7.5 komi in the 1980s when they played by Ing rules.

Did AlphaGo, as White, find some new strategies to make the game more difficult for Black? I suspect so. Anyway, the main advance of AlphaGo over current pros seems to be in the realm of strategy. Much food for thought.

aeb · Post by **aeb** » Sun May 28, 2017 7:02 am

DeepMind announced that they would publish 50 selfplay games by AlphaGo, and published the second batch of 10 today.
They can be found on http://homepages.cwi.nl/~aeb/go/games/games/AlphaGo/ both as tar-file and as separate games.

pookpooi · Post by **pookpooi** » Sun May 28, 2017 7:19 am

The other thread is so full of game records that it's hard to keep a conversation going.

I count Black winning 12 out of 50 games.
Only 24%.
If these games are not handpicked, then... Well, it's a human game anyway, so any change to komi has to be made by humans. But it seems like many pros think the same as AlphaGo on this matter.
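
To put a rough error bar on the 12 of 50 count, here is a sketch using the Wilson score interval. It assumes the 50 games are independent and not handpicked, which is exactly the open question, and `wilson_interval` is a name chosen for this example.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


black_wins, games = 12, 50
low, high = wilson_interval(black_wins, games)
print(f"Black: {black_wins}/{games} = {black_wins / games:.0%}, "
      f"interval about {low:.1%} to {high:.1%}")
```

Under those assumptions the upper end of the interval stays below both 50% and the 45% self-play figure quoted earlier in the thread, which would be consistent with either a real White edge or a non-random selection of games.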

aeb · Post by **aeb** » Sun May 28, 2017 7:26 am

DeepMind announced that they would publish 50 selfplay games by AlphaGo, and in fact published the remaining 40 today.
They can be found on http://homepages.cwi.nl/~aeb/go/games/games/AlphaGo/ both as tar-file and as separate games.

(The situation was messy. Their webpage did not work under Firefox or Chrome on Linux or macOS. Wget at first worked, later got "403 Forbidden", and then worked again. Looks like the DeepMind people struggled to make this work as intended. Maybe they got flooded with complaints and released everything?)

Kirby · Post by **Kirby** » Sun May 28, 2017 8:02 am

Bill,
I think 7.5 komi may very well be correct komi - just want to point out that testing with current version of AlphaGo is not necessarily rigorous.

For example, maybe AlphaGo trained with 10.5 komi plays very aggressively as black to overcome the point difference. You come up with a different program, so it remains possible that black wins more often against itself with this aggressive strategy.

Seems unlikely to me, but I just feel a single version of AlphaGo may not be qualified to generally prove correct komi. A more rigorous test might be to train different versions of AlphaGo each optimized for different Komi values, and see which version had closest to 50% winrate when playing against itself. Even then, it's unclear how long to train each version to make a fair experiment.
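
The experiment described above could be organized roughly like this. It is only a harness sketch: `self_play` stands for "a version trained for that komi plays one game against itself and reports whether Black won", and `toy_self_play` is a made-up random placeholder with an arbitrary built-in balance point, not anything resembling AlphaGo.

```python
import random
from typing import Callable, Dict, Iterable, Tuple


def black_win_rate(self_play: Callable[[float], bool], komi: float, n_games: int) -> float:
    """Fraction of n_games self-play games at this komi that Black wins."""
    wins = sum(1 for _ in range(n_games) if self_play(komi))
    return wins / n_games


def komi_closest_to_even(
    komis: Iterable[float], self_play: Callable[[float], bool], n_games: int
) -> Tuple[float, Dict[float, float]]:
    """Measure Black's win rate per komi and return the komi nearest 50%."""
    rates = {k: black_win_rate(self_play, k, n_games) for k in komis}
    best = min(rates, key=lambda k: abs(rates[k] - 0.5))
    return best, rates


def toy_self_play(komi: float) -> bool:
    """PLACEHOLDER engine: Black's margin is an invented 7.0 plus random noise."""
    made_up_balance_point = 7.0  # arbitrary illustration value, not a claim
    return made_up_balance_point + random.gauss(0.0, 5.0) > komi


random.seed(0)
best, rates = komi_closest_to_even([3.5, 7.5, 11.5], toy_self_play, n_games=2000)
print(best, rates)
```

The harness leaves open the hard part raised above: how long each version would need to be trained for the comparison to be fair.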

Or maybe I just don't see it in my head :-p

mistakenot · Post by **mistakenot** » Mon May 29, 2017 7:24 am

Spreadsheet with basic stats for the 50 games: https://docs.google.com/spreadsheets/d/ ... plIGo/view

Some trivial observations:

The longest game was #33 (346 moves).
The shortest game was #12 (180 moves).
Game #2 (where B287 captured 29 white stones) had the most white stones captured by black (51), but white won the game. It also had the most prisoners in total (80), so over 25% of the stones played in that game had been captured by the end.
Game #10 (where W244 captured 22 black stones) mirrored #2 with an equivalent difference in magnitude between black and white prisoners but favoring white, yet black won the game.
In contrast, games #13 and #40 each saw only 4 stones captured. Game #13 was relatively short (182 moves) but #40 was moderately long (266 moves) which meant it had the lowest fraction of captured stones among the set (1.5%).
The game with the most stones on the board at the end was #5 (283 stones, after 307 moves and 24 captures).
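
The arithmetic behind these figures can be sketched as follows, using only the numbers quoted above and assuming every recorded move places a stone (no passes counted as moves). The function names are made up for this example.

```python
# Move and capture counts as quoted in the observations above.
games = {
    5: {"moves": 307, "captured": 24},
    13: {"moves": 182, "captured": 4},
    40: {"moves": 266, "captured": 4},
}


def stones_on_board(moves: int, captured: int) -> int:
    """Stones left at the end, assuming each move placed exactly one stone."""
    return moves - captured


def captured_fraction(moves: int, captured: int) -> float:
    """Share of all stones played that had been captured by the end."""
    return captured / moves


for number, g in games.items():
    left = stones_on_board(g["moves"], g["captured"])
    share = captured_fraction(g["moves"], g["captured"])
    print(f"Game #{number}: {left} stones on board, {share:.1%} captured")
```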

Also, if needed here's another set of mirrors for the self-play games (ZIP archive) as well as links to download the SGFs directly from DeepMind: https://www.reddit.com/r/baduk/comments ... e/di5nwtl/

Incidentally, I noticed that most of DeepMind's SGFs were created with CGoban 3, except some from the last batch (#42, 44, and 46-50) which were created with GoGui 1.4.9. Doesn't make much difference, except the SGFs created by GoGui are slightly more compact than SGFs of similar length created by CGoban (because CGoban puts each move on a separate line).

Bill Spight · Post by **Bill Spight** » Mon May 29, 2017 8:32 am

Kirby wrote:Bill,
I think 7.5 komi may very well be correct komi - just want to point out that testing with current version of AlphaGo is not necessarily rigorous.

For example, maybe AlphaGo trained with 10.5 komi plays very aggressively as black to overcome the point difference. You come up with a different program, so it remains possible that black wins more often against itself with this aggressive strategy.

Can't you apply that argument to players of yore who made overplays as White to overcome the lack of komi?

Seems unlikely to me, but I just feel a single version of AlphaGo may not be qualified to generally prove correct komi.

I don't think that we can prove correct komi.
