Re: O Meien on AlphaGo Zero

Posted: Sat Jan 27, 2018 1:54 am
by Elom
From what I have been able to ascertain, upon the advent of computer chess programs, chess players began to copy the computers' 'cynical', somewhat materialistic, conservative style, along with many draws. After all, the computers were beating them, so this must be the way to play chess.

Hold down the forward button until AlphaZero's teaching games with Stockfish. It seems to play like a romantic (from our human perspective), opposite in many ways to a normal chess engine, using 'soft' move selection, discarded many years ago in favour of clever brute-force search, and playing with a 'whole-board' positional strategy... Strong chess players slightly adjusted their style to match that of the best engines, and now it turns out that the best engines up until now may have been playing chess completely wrong (from our human perspective).

May we tread with caution in the wake of strong Go playing engines, but I admit that the strength gap between Alpha Zero and the best humans is far greater than any traditional chess engine could ever dream of achieving, so...

Re: O Meien on AlphaGo Zero

Posted: Sat Jan 27, 2018 6:36 am
by Uberdude
djhbrown wrote:Random rollouts are, to me, the hallmark of MCTS; its very essence.

Alfie0 differs from Alfie Fan and Alfie etc in that it doesn't do them.

Maybe PHTS (Probabilistic Heuristic Tree Search) would be a more accurate name than MCTS for the kind of search Alfie0 does.
Yes, the MCTS name is rather unfortunate now, as it's possible to do it without the (semi-)random rollouts to a terminal game state. My understanding is that when Remi Coulom coined the name you couldn't do it without the rollouts: the games it was used for didn't have a decent evaluation function, so you had to use rollouts. Now that we have decent neural-network-based evaluation functions for non-terminal game positions, you can still do the tree search with the UCT exploration algorithm (aka MCTS), but without rollouts.
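
For what it's worth, here is a minimal sketch of what "UCT tree search without rollouts" can look like. It is only an illustration under loud assumptions: the game is a toy (a pile of stones, take 1 to 3, whoever takes the last stone wins), `evaluate` is a hand-written stand-in for a neural-network value function, plain UCB1 is used for selection (AlphaGo Zero's published search uses a PUCT variant with policy priors instead), and all the names (`Node`, `evaluate`, `simulate`, `best_move`) are made up for this example.

```python
import math

class Node:
    def __init__(self, stones):
        self.stones = stones    # stones left on the pile for the player to move
        self.children = {}      # move (stones taken) -> child Node
        self.visits = 0
        self.value_sum = 0.0    # summed values, from the view of the player to move here

def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def evaluate(stones):
    """Hypothetical stand-in for a value network: score in [-1, 1] for the player to move."""
    if stones == 0:
        return -1.0             # the opponent just took the last stone
    return -0.5 if stones % 4 == 0 else 0.5

def select_child(node, c=1.4):
    """UCB1 selection. Child values are stored from the child's point of view, so negate."""
    def ucb(child):
        if child.visits == 0:
            return float("inf")
        mean = -child.value_sum / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=ucb)

def simulate(node):
    """One simulation: descend the tree, expand a leaf, evaluate it statically, back up."""
    if node.stones == 0:
        value = -1.0                            # terminal position: player to move has lost
    elif not node.children:
        for m in legal_moves(node.stones):      # expand the leaf
            node.children[m] = Node(node.stones - m)
        value = evaluate(node.stones)           # static evaluation replaces the rollout
    else:
        value = -simulate(select_child(node))   # flip sign when switching sides
    node.visits += 1
    node.value_sum += value
    return value

def best_move(stones, simulations=2000):
    """Run the search from a fresh root and return the most-visited move."""
    root = Node(stones)
    for _ in range(simulations):
        simulate(root)
    return max(root.children, key=lambda m: root.children[m].visits)
```

Calling `best_move(10)` runs 2000 simulations from a pile of 10 stones. The only place a classic MCTS would differ is the line marked "static evaluation replaces the rollout": there it would play (semi-)random moves to the end of the game and use the result instead of calling `evaluate`.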

However....
djhbrown wrote:The thing is, Alfie0 doesn't do random rollouts - which i regard as a significant technological development, one which takes her away from making weird moves like Master et al, and takes her away from going "On Tilt", and makes her overall behaviour much closer to the received wisdom of sages down the ages.
If by tilt you mean doing things like stupid sente moves when losing, the hypothesised rationale being the random rollouts give rise to the "Oh, the rollouts means maybe they don't answer this obvious sente move and then I reverse the game!" idea, then it's certainly an appealing idea, but unfortunately doesn't seem to be true. We only have a few games where AG0 is losing, but it does do 'on tilt' stupid sentes. See http://www.alphago-games.com/view/event ... 0/move/182 and next few moves until it resigns.

Re: O Meien on AlphaGo Zero

Posted: Sat Jan 27, 2018 7:43 am
by djhbrown
Elom wrote:chess ... positional strategy... tread with caution
it's very instructive that the experiences of top chess players precisely mirror those of top Go players - this suggests to me that whereas treading with caution is, with hindsight, something both should have done before, it looks to me that with Alfie0, they can now throw off any reluctance and dive right into what she has to say, despite the fact that she also gets desperate when behind, so my comment about her avoiding Tilt was unjustified - although, maybe if her programmers had set too high a threshold (too low a win%) for resigning, it is rational of her to become desperate when there's no hope!

In particular, whereas i felt at the time just before Alfie Fan came along that the apparent superior positional judgement of MCTS was, in fact, an inferior positional judgement compensated for by the combination of surprise element plus exhausting (albeit not exhaustive) tactical reading (and was roundly chastised for daring to utter such a heresy), i see Alfie0's PHTS+DCNN as almost the exact opposite of MCTS+DCNN, and i think Alfie0 IS endowed with superior positional judgement (gained through extensive reading), simply because, unlike her predecessors, she really does separate the wheat from the chaff, as she doesn't go wandering off into the maze of random rollouts. One piece of evidence for this is that she seems to understand moyos better than Master, and can see when she can live inside one, something that Swim had a good look at in the context of making sense of Jue Yi's New Move:
https://www.youtube.com/watch?v=KSVi8n4 ... S&index=27

So i reckon Go scholars have, in Alfie0, the diamond they dreamed of when clutching at the straw pebble of MCTS (sorry for all the mixed metaphors and allusions).

Because it is such a tiny microworld, Go doesn't have much to tell about intelligence, artificial or natural; but on the other hand, every journey of 1000 miles begins with a single step, and it could well be that the playpen prowess of Alfie0's PHTS could be a precious first step, alongside those of Simon and Minsky and McCarthy and Hofstadter et al:
https://www.youtube.com/watch?v=Ezz_lhY ... S&index=20

Alfie0 would be as lost as any of us without her parallel hardware, which enables her to read both right to the end and wide enough to (mostly) not overlook anything important, but it could be that PHTS would be pretty effective with the help of a static evaluation function in domains where there is no end in sight, not even for an army of TensorFlow machines.

A domain like macroeconomics, for example...