Three months ago Ohashi Hirofumi started a mini-series in Go World as part of its 800th issue celebrations. Using data from the Chinese AI program Golaxy, he looked at three famous games, by Jowa, Shusaku and Dosaku respectively. At the time they appeared I just skimmed them. I had barely looked at AI games and wanted to read a primer before studying these historical games closely.
I failed to find my desired primer on last month's trip to Tokyo, and the book haul overall was a disappointment (though I was amply compensated by joining the Tokyo branch of the RSCDS for their monthly Scottish country dancing class, and by a visit with my grandson to the Tamiya Factory so he could pick up some Russian tanks).
I did buy several AI books, but most were potboilers, and the only one I really rated as worth reading on the plane was a book by Ohashi himself. He seems to be the pro most knowledgeable about AI, and he has all the best contacts in China and Korea. (He also writes well.)
But when I came back I decided finally to delve into the historical series without trying to stuff myself full of background first. That is not to say my mind was a tabula rasa: I have been looking at some old games and books, comparing their comments with Lizzie. With books it is a curate's egg: the only one I found that scored consistently well was Kimu Sujun's recent book on the Four Basic Rules for Surrounding Territory Efficiently (novel stuff: perhaps influenced by AI study?). Most books score more like 50:50 or 60:40. The celebrated Katsugo Shinpyo scores more like 0%, although I expect that has a lot to do with Lizzie evaluating whole-board positions whereas KS is about local positions.
Commentaries and actual games, however, seem to score well on the whole. Even where a pro makes a mistake, a human commentator has usually spotted it before the bot, and more often than not the pro move is close to the top few selected by the bot. The game seems to unravel mainly because of just one or two big mistakes. The other thing I have noticed with old games is that the nominally stronger player generally scores better on the AI scale than the weaker one.
The only game in Ohashi's series I have looked at so far is the Three Brilliancies game between Jowa and Akaboshi Intetsu (this is also the subject of my Slate & Shell book Brilliance).
The most significant point to cover first is the komi. AI bots are generally trained on a komi of 7.5 points, and this badly affects the reliability of their assessments of no-komi games. I don't really understand why, but the Elf team, when they were adding AI commentary to all the GoGoD games, told me this was a major point, though results in the early fuseki are probably not too badly affected.
However, the reason Golaxy was used in Ohashi's series is that it can cope with different komis. Other plus points are that it is probably stronger even than AlphaGo and that it can give its evaluations not just as winning percentages but as point scores. It has concluded that on an empty board with no komi Black wins on average by 6.1 points.
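To get an intuition for why komi matters so much to the winning percentages, it may help to think of the evaluation as a spread of possible final margins rather than a single number. What follows is just a toy Monte Carlo sketch in Python - the 12-point spread is purely my own assumption, and this is certainly not how Golaxy works internally - but it shows how the same 6.1-point average lead translates into very different winning percentages at komi 0 and komi 7.5:

import random

def black_win_rate(mean_margin, spread, komi, trials=100_000):
    # Toy model: final margins (in points, in Black's favour) are
    # drawn from a normal distribution; Black wins whenever the
    # margin clears the komi threshold.
    wins = sum(random.gauss(mean_margin, spread) > komi for _ in range(trials))
    return wins / trials

# Golaxy's empty-board estimate: Black ahead by 6.1 points on average.
print(black_win_rate(6.1, 12.0, komi=0.0))   # no-komi game: well above 50%
print(black_win_rate(6.1, 12.0, komi=7.5))   # modern komi: below 50%

A bot whose evaluations were only ever calibrated to the komi-7.5 question is, in effect, always measuring against the wrong threshold in a no-komi game.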
For the evaluations done on the historical games, the machine was run for an entire day, looking at 5 million nodes per move. Reading out the results of that data-crunching was no easy task either.
The overall picture was that Jowa did not make any serious booboos. Intetsu made a couple, but most moves that were not rated best by the computer were close to the best, or could be adjudged either simply slack or deliberately risky - in both cases (as Ohashi takes pains to demonstrate) based on positional evaluations and explainable by psychology. That does not mean the human evaluations were correct, but they were at least rational.
The three brilliancies were not quite the best moves according to the bot, but they were not at all bad, and Ohashi argued (and demonstrated convincingly, I thought) that they should still be regarded as myoshu. He said that a myoshu is a brilliant move that is hard to see, and that is the point: if it was hard for Jowa to see them, it was also hard for Intetsu. The best move according to the AI for Jowa's first brilliancy was an easy-to-see invasion, but it left him behind overall, as Jowa must have realised. Black's winning ratio at that point was 64%, or 3 points. After waving his Harry Potter wand, Jowa went back to the invasion, but now he was level pegging! Intetsu had failed to find the best replies.
(Incidentally there was a case a little later where Black's winning ratio was 61.1% but territorially White was ahead by 0.2 points. Ohashi admits this is hard to understand.)
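My own guess at what is going on - and it is no more than a guess - is that a winning percentage depends on the whole spread of possible outcomes, not just the average margin. If most continuations leave Black slightly ahead but a small fraction lose him a large group, the average points count can tip slightly to White while Black still wins more often than not. The numbers below are invented purely to make the arithmetic come out:

import random

def sample_margin():
    # Hypothetical: in 90% of lines Black scrapes home by about 2 points;
    # in the other 10% something dies and he loses by about 20.
    if random.random() < 0.9:
        return random.gauss(2.0, 1.0)
    return random.gauss(-20.0, 3.0)

margins = [sample_margin() for _ in range(100_000)]
print(sum(margins) / len(margins))                 # mean margin: about -0.2 points
print(sum(m > 0 for m in margins) / len(margins))  # yet Black wins far more than half

So a percentage above 50 for Black alongside a points count slightly in White's favour is not actually a contradiction.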
It has already been pointed out on this forum that there are several cases where a bot does not even list a particular move in its top N moves, yet when that move is actually played the win ratio barely changes. The same thing seems to happen with Golaxy. In fact, it often didn't "see" a move preferred by Lizzie (and vice versa, of course). Ohashi does not describe every move, so it is hard to compare Golaxy and Lizzie, but my impression was that Lizzie preferred the same sort of moves as Golaxy, yet on a couple of crucial occasions totally missed a killer move inside a variation spotted by Golaxy.
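A plausible mechanical explanation (assuming Golaxy searches in the same PUCT style as the AlphaGo and Leela Zero family, which I cannot verify) is that the search budget is steered by the policy network's prior: a move the network never learned to suggest gets a microscopic exploration bonus, and so may never be examined at all, however sound it is. When the move is then actually played, the bot evaluates the new position from scratch and finds it perfectly fine, so the percentage barely moves. A back-of-envelope illustration with made-up numbers:

import math

def exploration_bonus(prior, parent_visits, child_visits=0, c_puct=1.5):
    # The PUCT term that decides which candidate move gets searched next:
    # proportional to the prior, shrinking as the move accumulates visits.
    return c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

for prior in (0.30, 0.001):
    bonuses = [exploration_bonus(prior, n) for n in (100, 10_000, 1_000_000)]
    print(prior, [round(b, 3) for b in bonuses])

# prior 0.30  -> 4.5, 45.0, 450.0 : searched almost at once
# prior 0.001 -> 0.015, 0.15, 1.5 : may never be reached within the budget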
Going back to the slack/risky moves point, Ohashi several times made the point that Intetsu (with the advantage of Black, of course) made moves that he must have realised were slack but safe, as he clearly judged he was ahead (and he was - but he was whittling down his own lead). Ohashi didn't make the point, but I noticed that the far fewer slack moves Jowa made came at points when Intetsu had just made a slack move, as if he were relieved to have the chance to do a bit of "free" patching up. In the case of risky moves, all by Jowa, his timing and psychology seemed to be spot on. Ohashi claims that this is also a defining skill of elite players such as Ke Jie and Iyama Yuta. Either way, as already said, these dubious moves all reflected the players' possibly ropey evaluation of the overall position. But that is by far the toughest aspect of the game for humans.
As an example of human evaluation versus computer evaluation, consider this position:
(;AB[rc][rd][qg][ph][oh][qb][ra][cp][eq][ck][ce][kc][ld][nd][od][pe][qe][qd]AW[dc]
[ed][ci][qi][pi][op][qc][mb][oe][of][pf][mf][rb][pd][pc][pb][oc]TR[ra]LB[nc:A][qa:B]SZ[19]
)
The triangled move was 35 in the game. Black had just before made the famous hane in the corner. It was famous because the Inoue school had studied it intensely and thought it was a secret weapon. Indeed, there was long an opinion that Black succeeded with this ploy, but maybe Jowa was not so impressed, and Golaxy certainly wasn't. It rated the Black hane 33 as a bad move that reduced Black's territorial lead from 5.7 to 2.1 points.
However, Jowa and Golaxy differed in their choice of reply. Jowa chose to force at A and then lived in the corner. Golaxy preferred to fight the ko with B, and came up with the following line of play, giving the position where it thought it had gained 3.6 points. For a human even to feel that White had made a gain here is surely problematical - there is just too much left up in the air. Jowa's (slightly) inferior move may well be regarded as correct for a human.
(;AB[rc][rd][qg][ph][oh][qb][cp][eq][ck][ce][kc][ld][nd][od][pe][qe][qd][ra][pa]
[sb][rp][cq][nh][kf][dq]AW[oe][of][pf][mf][pd][pc][pb][oc][dc][ed][ci][qi][pi][qq]
[oi][mj][op][qc][qp][mb][bp][co][dp][dn]SZ[19]
)
In all the comments I have read on the new AI style of play, many people have pointed out the new kinds of moves (e.g. high shoulder hits), there have been insightful characterisations of the style (e.g. a very early emphasis on making the opponent overconcentrated), and there have been new words (e.g. the tiger enclosure). But nowhere have I seen anything to suggest that humans have even begun to get a grip on how to evaluate positions such as the second one above. Everything seems to indicate that humans are still satisfied (because they have to be) with Jowa's kind of response.
That should not surprise us if we look back at Shin Fuseki. There was great excitement at the time, and many books and articles purporting to elucidate the theory. But it did not take long before even the excitable players more or less resumed normal service, and Shin Fuseki left barely a trace. Of course, in all the excitement new josekis emerged, just as some people are still getting very excited about AI josekis. But there at least we can perhaps say that the AI bots have really done little more than what the best human joseki masters, such as Go Seigen, had already done.
In fact, I have continued to be encouraged by how well humans appear to have done overall in the AI comparisons. Jowa's reputation seems to have emerged relatively unscathed, at least in human terms, and I gather there may even be surprises in store as to how much better Dosaku shows up. I may report on that, and on the Shusaku game, in due course, although I have to say that contributing to L19 feels a bit like equine necrophilia these days.