Crazy Stone Deep Learning first impressions

oren · Post by **oren** » Thu May 19, 2016 9:38 am

I ended up not cleaning it up much at all, but here is a dump of what I did.

https://github.com/oren740/go-tools/tre ... e-analysis

dfan · Post by **dfan** » Thu May 19, 2016 9:49 am

Amtiskaw's script worked for me, even though I'm on Windows 7 and thus have an .xps file rather than an .oxps file. Apparently the two formats are close enough.

Amtiskaw · Post by **Amtiskaw** » Thu May 19, 2016 10:01 am

There were some bugs in it, but they should be fixed now. I hope. Ugh, regular expressions.

I'll probably add some metadata extraction, since the XPS contains stuff like usernames.

[EDIT: Done. I also added hotspots in the SGF for moves CrazyStone found particularly bad. I recommend the Sabaki SGF editor for easily jumping to these.]
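For anyone curious why one script can read both .xps and .oxps: both are ZIP-based packages, so the Python standard library can open them directly. Here is a minimal sketch of that first step only, not Amtiskaw's actual script. The helper names `list_xps_parts` and `page_parts` are made up for illustration, and the part layout inside a Crazy Stone export is an assumption.

```python
import io
import zipfile


def list_xps_parts(data):
    """Return the names of all parts inside an XPS/OXPS package (a ZIP archive)."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


def page_parts(names):
    """Keep only the page parts; fixed pages conventionally end in .fpage."""
    return sorted(name for name in names if name.lower().endswith(".fpage"))
```

You would pass in the raw bytes of the exported file, for example `list_xps_parts(open(path, "rb").read())`, and then parse the XML of each page part to pull out the text.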

LokBuddha · Post by **LokBuddha** » Thu May 19, 2016 10:33 am

Second game with 4 stones.
Strange that Crazystone didn't resign and dragged the game out for 100+ moves.

LokBuddha · Post by **LokBuddha** » Sat May 21, 2016 9:25 pm

Another game, this time with a 2-stone handicap.

I won, but this time there was no problem, and Crazystone resigned very early too. Too much aji keshi by CS. No global consideration from CS, and tactically weak fighting too... I don't think I played well either; I made quite a number of mistakes.

Can someone take a look at the game and review please?

Did I waste my $80, or is my hardware too weak for Crazystone? I have an i7-4790 at 3.6 GHz.

I'll try even game next.

Amtiskaw · Post by **Amtiskaw** » Sun May 22, 2016 1:44 am

LokBuddha wrote:my hardware too weak for Crazystone?

I'm under the impression that, when you select a level, it just uses whatever time needed to play "at that level".

Krama · Post by **Krama** » Sun May 22, 2016 4:29 am

Crazystone got demolished by Haylee. Such a disappointment.

dfan · Post by **dfan** » Wed Jun 08, 2016 4:41 am

dfan wrote:I played a couple of quick games at the 5 kyu level. In both cases I had a comfortable opening lead, got lazy, got tricked tactically in a big life and death situation, and lost. I learned plenty from going over the ensuing analyses, so that's great. I did feel that it didn't play a lot like a 5 kyu human - lots and lots of pushing over and over, very little tenuki. This was just two games though. If I have to play people (or Crazy Stone on a higher level) to get more interesting fuseki, that's okay. On the other hand, in one game it "misread" a relatively straightforward life and death issue in a human sort of way, and so did I; in the analysis it was happy to point out what it "missed". (Scare quotes are all because of course it would have gotten it right running at full strength.)

I've been working my way up through its ranks and have won my last five games (W+R vs 4k, B+8.5 vs 3k, B+R vs 3k, W+14.5 vs 2k, W+R vs 2k). I'm curious how its strength setting is calibrated.

Its play has gotten more interesting as the rank has increased, unsurprisingly. I still feel like it has a pronounced tendency to get into pushing battles. That's pretty much the only way in which I feel like I can take advantage of its botness; I can sometimes encourage it into a pushing battle that I think benefits me. It is also often eager to capture a few stones while I make nice thickness that I think outweighs the sacrifice. Of course these traits are true of many humans as well at this level.

I'm eager to see how it plays when I find a setting that beats me 50% of the time.

Going through its analysis interactively afterwards is extremely illuminating, and I feel like I have already learned a great deal, less about concrete variations, and more about what the interesting candidate moves (as we say in chess) are locally and where the biggest / most urgent area on the board is.

It's also nice to be able to play a relatively serious game in which I can think for a while without having to worry about finding a large uninterrupted chunk of time.

OtakuViking · Post by **OtakuViking** » Wed Jun 08, 2016 5:01 am

Haven't tried any of the low levels, only 7d so I dunno how good the lower levels are.

I just wanted to mention that upping the priority in joblist is a good idea if you want Crazystone to be stronger at the 7d level with unlimited time (I found it also plays significantly faster).
Unlike programs like Leela, Crazystone doesn't utilize the CPU very efficiently or fully. Leela even continues searching nodes while the opponent plays, and you can see how many nodes it has searched. You can also make it search a ton of nodes (thus upping its strength a lot) by giving it a lot of time, and then force it to move once it has read out many, many nodes. I wish CSDL had this sort of functionality. CSDL is certainly stronger than Leela, which is as it should be, but Leela has some stuff in its base/free UI that I wish CSDL had... it needs more customizability to play around with, tbh.

Mike Novack · Post by **Mike Novack** » Wed Jun 08, 2016 6:07 am

dfan wrote:
I've been working my way up through its ranks and have won my last five games (W+R vs 4k, B+8.5 vs 3k, B+R vs 3k, W+14.5 vs 2k, W+R vs 2k). I'm curious how its strength setting is calibrated.

Are you playing with time controls?

When any of these programs plays with time controls (X amount of time for Y moves), it has to work out how much analysis it can do from the power of the hardware it finds itself running on, so the strength levels cannot be absolute, just relative.

On the other hand, when playing without time controls, a program set to a given amount of analysis or a given strength level cannot know how much time will be required per move. It will simply use as much time as that depth of analysis requires on that hardware. For a given level of strength, hardware power and real time per move will be inversely proportional.
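The two modes can be put into a few lines of arithmetic. This is only a sketch of the reasoning above, not how Crazy Stone actually budgets its search. Measuring work in "playouts" and the two function names are assumptions made for illustration.

```python
def seconds_per_move(playouts_needed, playouts_per_second):
    """Fixed strength level: the work is fixed, so time scales as 1 / hardware speed."""
    if playouts_per_second <= 0:
        raise ValueError("playouts_per_second must be positive")
    return playouts_needed / playouts_per_second


def playouts_available(seconds_allowed, playouts_per_second):
    """Fixed time control: the time is fixed, so the work done scales with hardware speed."""
    if playouts_per_second < 0:
        raise ValueError("playouts_per_second must not be negative")
    return seconds_allowed * playouts_per_second
```

With a fixed level, doubling the hardware speed halves the time per move and leaves the strength alone. With a fixed time control, it doubles the analysis done per move, which is why the level can only be relative in that mode.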

dfan · Post by **dfan** » Wed Jun 08, 2016 6:23 am

Mike Novack wrote:
dfan wrote:
I've been working my way up through its ranks and have won my last five games (W+R vs 4k, B+8.5 vs 3k, B+R vs 3k, W+14.5 vs 2k, W+R vs 2k). I'm curious how its strength setting is calibrated.
Are you playing with time controls?

No. When I choose a time control I am unable to choose the rank of the engine.

Mike Novack wrote:
When any of these programs plays with time controls (X amount of time for Y moves), it has to work out how much analysis it can do from the power of the hardware it finds itself running on, so the strength levels cannot be absolute, just relative.

On the other hand, when playing without time controls, a program set to a given amount of analysis or a given strength level cannot know how much time will be required per move. It will simply use as much time as that depth of analysis requires on that hardware. For a given level of strength, hardware power and real time per move will be inversely proportional.

Yes. I assumed that playing at "2k" would use a basically fixed amount of compute and just take as long as it needed to do that much work. On my computer, it plays essentially instantaneously.

It just occurred to me that you might have misunderstood what I meant by "I'm curious how its strength setting is calibrated." I meant that I am curious how it has been determined that a particular setting should be called 2k rather than 4k or 1d, not curious about the mechanism by which its strength is regulated (though that is interesting too, of course).

Kap · Post by **Kap** » Wed Jun 08, 2016 7:33 am

dfan wrote: Yes. I assumed that playing at "2k" would use a basically fixed amount of compute and just take as long as it needed to do that much work. On my computer, it plays essentially instantaneously.

It just occurred to me that you might have misunderstood what I meant by "I'm curious how its strength setting is calibrated." I meant that I am curious how it has been determined that a particular setting should be called 2k rather than 4k or 1d, not curious about the mechanism by which its strength is regulated (though that is interesting too, of course).

Relevant posts by Rémi:

Rémi wrote:
Kirby wrote:How are weaker levels calibrated? Is it purely a function of time spent per move?
No. Weaker levels play instantly.

All the levels up to 1k are pure neural networks. They play almost instantly. I collected game records of weak players from KGS, and produced neural networks that imitate them. So the weak levels are considerably more human-like than before. Strength was calibrated by playing games on KGS.

As you can see on the attached sgf, it produces a very human-like way of being weak. I am really happy with the result.

It is interesting that the 13k neural network was produced by learning from game records of 15k players. I suppose it may have become stronger because it plays instantly, and because it can read ladders better.

Rémi wrote:
Kirby wrote:Thanks, Rémi. That's very fascinating. I think making the weaker levels "naturally weak" is a great idea.

How do levels between 1d and 7d work?
1d and beyond is MCTS as usual, except that the neural network makes it a lot stronger.

Rémi wrote:
Marcel Grünauer wrote:This sounds very exciting!

On faster hardware, does "Crazy Stone Deep Learning" play better or just faster?

I'm asking because the iPad version has been said to have the same strength on all models.
The top levels are in fact defined as time per move, so they will be stronger with more powerful hardware.

from http://www.lifein19x19.com/forum/viewto ... 18&t=12342.

- Kap

dfan · Post by **dfan** » Wed Jun 08, 2016 8:00 am

Interesting, thank you! I am surprised that it is doing no MCTS at all, though I know that the neural-network-only version of AlphaGo was already dan level, so it makes sense. It does perhaps explain a little more how it missed some tactical tricks.

I am pleased to see that the strengths were calibrated by playing games on KGS. One possible reason that it seems to be stronger than the level of player it is imitating (at least in Rémi's example of the 13k setting being trained on 15k games) is a sort of "mob intelligence" principle; a thousand 15k players voting on moves will probably play better than a single 15k player playing by himself, and the training of a neural network effectively leads it to imitate a ton of players voting.
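A toy simulation makes the voting intuition concrete. It is a sketch under strong assumptions, not a model of how the network is trained: each position has one best move (labelled 0), every player finds it independently with probability `p_best`, and a player who misses it picks among the other candidates at random. All names and default parameters are made up for illustration.

```python
import random
from collections import Counter


def pick_move(rng, p_best, n_candidates):
    """One player: play move 0 (the best) with probability p_best, else a random other move."""
    if rng.random() < p_best:
        return 0
    return rng.randrange(1, n_candidates)


def crowd_move(rng, n_voters, p_best, n_candidates):
    """Many players vote independently; the most popular move is played."""
    votes = Counter(pick_move(rng, p_best, n_candidates) for _ in range(n_voters))
    return votes.most_common(1)[0][0]


def accuracy(choose, trials):
    """Fraction of trials in which the chosen move is the best one (move 0)."""
    return sum(choose() == 0 for _ in range(trials)) / trials


def compare(n_voters=101, p_best=0.4, n_candidates=5, trials=2000, seed=0):
    """Return (single-player accuracy, crowd accuracy) under the toy assumptions."""
    rng = random.Random(seed)
    single = accuracy(lambda: pick_move(rng, p_best, n_candidates), trials)
    crowd = accuracy(lambda: crowd_move(rng, n_voters, p_best, n_candidates), trials)
    return single, crowd
```

Calling `compare()` should show the crowd finding the best move far more often than any single voter, because the wrong votes are spread over several moves while the right ones pile up on one. Real players' mistakes are correlated (everyone at 15k falls for the same tricks), so the real effect is surely much smaller than in this toy.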

I look forward to seeing how it plays when I get to the 1d setting. I wonder whether its style will abruptly shift.

Bill Spight · Post by **Bill Spight** » Wed Jun 08, 2016 9:52 am

dfan wrote:One possible reason that it seems to be stronger than the level of player it is imitating (at least in Rémi's example of the 13k setting being trained on 15k games) is a sort of "mob intelligence" principle; a thousand 15k players voting on moves will probably play better than a single 15k player playing by himself, and the training of a neural network effectively leads it to imitate a ton of players voting.

Good point!

CnP · Post by **CnP** » Thu Jun 09, 2016 1:51 am

dfan wrote: I look forward to seeing how it plays when I get to the 1d setting. I wonder whether its style will abruptly shift.

Just a minor example: here's an even game I played against CS on 3k. I'm Black. White's move 16 seems dodgy, and running the full-strength analysis mode suggests P3 as a better move. However, I sort of lost all respect for CS 3k when it kept pushing from below with the sequence 22 - 34 instead of capturing the black stones earlier. After that I got sloppy and played some stupid moves in addition to my usual stupid moves, but ended up winning by 84.5. CS 1d doesn't seem to play like that (and the analysis mode would have captured the stones), so that's what I'm running it with (no point in reinforcing bad play).
