Life In 19x19

Posted: **Thu Dec 07, 2017 3:56 pm**

New entries in League D are Leela Zero with the network file from 2017-12-03 and Beancounter, my own attempt to write a Go engine.

Leela vs. AQ

    1. AQ 2.0.3                     12/16
    2. Leela 0.11.0 Beta 11          4/16

Configuration:

League A:

    1. Leela 0.10.0                 22/24
    2. Rayon 4.6.0                  19/24
    3. Oakfoam 0.2.1 NG-06          18/24
    4. Hiratuka 10.37B (CPU)         9/24
    5. DarkForest v2 MCTS 1.0        7/24
    6. DarkGo 1.0                    5/24
    7. Pachi DCNN 11.99              4/24

Configuration:

League B:

    1. Pachi DCNN 11.99             29/32
    2. Ray 9.0.1                    27/32
    3. MoGo 4.86                    21/32
    4. deltaGo 1.0.0                17/32
    5. Fuego 1.1                    17/32
    6. Leela Zero 0.1               15/32   
    7. Michi C-2 1.4.2               9/32
    8. Orego 7.08                    7/32
    9. GNU Go 3.8                    2/32

Configuration:

League C:

    1. GNU Go 3.8                   24/28
    2. Hara 0.9                     18/28
    3. Dariush 3.1.5.7              16/28
    4. Indigo 2009                  15/28
    5. Matilda 1.24                 15/28
    6. Aya 6.34                     11/28
    7. Fudo Go 3.0                  11/28
    8. JrefBot 081016-2022           2/28

Configuration:

League D:

    1. JrefBot 081016-2022          18/20             
    2. Iomrascálaí 0.3.2            17/20
    3. Crazy Patterns 0008-13       15/20
    4. Marcos Go 1.0                15/20
    5. AmiGo 1.8                    15/20
    6. Beancounter 0.1              10/20
    7. Leela Zero 0.6 (2017-12-03)   7/20
    8. Stop 0.9-005                  5/20
    9. GoTraxx 1.4.2                 4/20
   10. CopyBot 0.1                   2/20
   11. Brown 1.0                     2/20

Configuration:

Links:

Best,
Alex

Posted: **Fri Dec 08, 2017 12:16 pm**

as0770 wrote:

    1. JrefBot 081016-2022          18/20             
    2. Iomrascálaí 0.3.2            17/20
    3. Crazy Patterns 0008-13       15/20
    4. Marcos Go 1.0                15/20
    5. AmiGo 1.8                    15/20
    6. Beancounter 0.1              10/20
    7. Leela Zero 0.6 (2017-12-03)   7/20
    8. Stop 0.9-005                  5/20
    9. GoTraxx 1.4.2                 4/20
   10. CopyBot 0.1                   2/20
   11. Brown 1.0                     2/20

Four more days of training for Leela Zero and this is the result:

    1. JrefBot 081016-2022          18/20             
    2. Iomrascálaí 0.3.2            17/20
    3. Crazy Patterns 0008-13       15/20
    4. AmiGo 1.8                    14/20
    5. Marcos Go 1.0                13/20
    6. Leela Zero 0.8 (2017-12-07)  12/20
    7. Beancounter 0.1               8/20
    8. Stop 0.9-005                  5/20
    9. GoTraxx 1.4.2                 4/20
   10. CopyBot 0.1                   2/20
   11. Brown 1.0                     2/20

Posted: **Sat Dec 09, 2017 5:32 am**

as0770 wrote:...The difference between ponder-on and ponder-off matches is nearly irrelevant compared to all the other influences, and it is certainly masked by statistical fluctuation when rating the engines with fewer than 100 games.

For sparring between engines of the same strength, pondering is a significant parameter...
But the most significant difference (for playing strength) between these synthetic tests and real professional games is in the time control values.

Posted: **Sun Dec 10, 2017 9:29 am**

q30 wrote:For sparring between engines of the same strength, pondering is a significant parameter...

Based on what tests? I have played thousands of computer AI games. Even if one engine is pondering and the other is not, you need hundreds of games to see the difference.

Posted: **Sat Dec 16, 2017 1:25 am**

as0770 wrote: Based on what tests? ...

On tests with parameters close to a real game (2'' per move, for example).

Posted: **Sun Dec 17, 2017 11:56 pm**

q30 wrote:
as0770 wrote: Based on what tests? ...
On tests with parameters close to a real game (2'' per move, for example).

In chess, doubling the calculating time makes engines stronger by about 60 ELO points. Engines of the same strength have about 50% ponderhits. So the difference between a pondering engine and a non-pondering engine is about 30 ELO. You need more than 1000 games to measure a difference of 30 ELO. And we are talking about ponder vs. no ponder. In Go the difference is even smaller because there are fewer ponderhits. Also the ELO gap between engines and their pondering ELO gain is much less than 30 ELO. So after all it is simply impossible to measure a difference in the ELO gain from pondering with such a small number of games.
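The "more than 1000 games" figure can be sanity-checked with a rough sketch. It assumes the standard logistic ELO formula, no draws, a normal approximation, and a conventional 5% significance level with 80% power; none of these settings come from the post, and the function names are only illustrative.

```python
import math
from statistics import NormalDist


def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic ELO model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff, alpha=0.05, power=0.8):
    """Approximate number of games needed to detect elo_diff.

    Assumptions (not from the thread): no draws, independent games,
    two-sided test at level alpha, normal approximation.
    """
    if elo_diff == 0:
        raise ValueError("a zero ELO difference can never be detected")
    p = expected_score(elo_diff)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    sigma = math.sqrt(p * (1 - p))  # per-game standard deviation
    n = ((z_alpha + z_power) * sigma / (p - 0.5)) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    for diff in (30, 60, 100):
        print(diff, round(expected_score(diff), 3), games_needed(diff))
```

Under these assumptions a 30 ELO gap corresponds to an expected score of about 54%, and the estimate lands on the order of a thousand games. Draws would lower the per-game variance and change the number somewhat.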

Posted: **Sat Dec 23, 2017 12:51 am**

as0770 wrote: In chess doubling the calculating time will make Engines stronger by 60 ELO Points...

The dependence of strength on time is not perfectly linear, so the strength gained by doubling the time will depend on the absolute time value.
Pondering may have no effect only on simple MC engines such as MoGo, where, for example, increasing the thinking time with "--earlyCut 0" doesn't make the engine play stronger.
I'll test the effect of pondering on MoGo, Pachi, Ray and Leela soon.

Posted: **Tue Dec 26, 2017 8:34 am**

q30 wrote:
as0770 wrote: In chess doubling the calculating time will make Engines stronger by 60 ELO Points...
The dependence of strength on time is not perfectly linear, so the strength gained by doubling the time will depend on the absolute time value.

30 years of statistics in computer chess say something different.

Posted: **Wed Dec 27, 2017 9:37 am**

Leela Zero is much stronger now, can you test it again please?

Posted: **Wed Dec 27, 2017 10:06 am**

Cyan wrote:Leela Zero is much stronger now, can you test it again please?

I'd love to do so, but I am on vacation.

I'll be back in 1-2 weeks.

Posted: **Wed Dec 27, 2017 1:03 pm**

as0770 wrote:
q30 wrote:
as0770 wrote: In chess doubling the calculating time will make Engines stronger by 60 ELO Points...
The dependence of strength on time is not perfectly linear, so the strength gained by doubling the time will depend on the absolute time value.
30 years of statistics in computer chess say something different.

Are you sure? I'm pretty sure I recall more than one case of computer chess statistics, one informally posted on a forum, and one from some published paper, indicating slightly sublinear elo gains with log(time). Possibly other attempts I didn't see got more linear results, perhaps it depends a little on the engine and perhaps you only see it if you test a wide enough range.

The order of magnitude differences were like a +35 elo difference for a given time multiplication factor becoming a +25 elo difference for that time multiplication factor between the ends of a range that was 5 or 6 orders of magnitude wide, or something like that (those numbers are all made up, I'm just trying to convey the rough scale of things that I fuzzily recall). So, not a big difference, but still a bit nonlinear. Unless I just made up those memories.

Posted: **Thu Dec 28, 2017 10:12 am**

lightvector wrote:Are you sure? I'm pretty sure I recall more than one case of computer chess statistics, one informally posted on a forum, and one from some published paper, indicating slightly sublinear elo gains with log(time). Possibly other attempts I didn't see got more linear results, perhaps it depends a little on the engine and perhaps you only see it if you test a wide enough range.

The order of magnitude differences were like a +35 elo difference for a given time multiplication factor becoming a +25 elo difference for that time multiplication factor between the ends of a range that was 5 or 6 orders of magnitude wide, or something like that (those numbers are all made up, I'm just trying to convey the rough scale of things that I fuzzily recall). So, not a big difference, but still a bit nonlinear. Unless I just made up those memories.

Indeed this is true. I left out that with faster hardware or longer time controls there is a slight decrease in the ELO gain. But we are talking about a decrease from a 70 ELO gain on a 286 30 years ago to a 60, maybe 50 ELO gain nowadays. This has something to do with the increasing number of draws with nearly perfect play.

Posted: **Fri Dec 29, 2017 8:15 am**

The results (with pondering - without pondering):
MoGo 3 - 1;
Pachi 3 - 1;
Ray 3 - 1;
Leela 3 - 1;
in all 12 - 4 (details).
I don't know about the quantitative result (in ELO), but there is definitely a qualitative effect, and in sparring between Go engines of equivalent strength, an engine with pondering may pass the same engine without pondering in rating.
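As a rough check on how much a 12 - 4 score says, here is a minimal sketch of a one-sided binomial tail. It assumes the games are independent, that there are no draws, and that the null hypothesis is "pondering makes no difference" (win probability 0.5). Pooling the four engines into one sample is a further assumption, and the function name is only illustrative.

```python
from math import comb


def binomial_tail(wins, games, p=0.5):
    """Probability of at least `wins` wins in `games` games when each
    game is won independently with probability p."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


if __name__ == "__main__":
    print(binomial_tail(3, 4))    # a single engine's 3 - 1 result
    print(binomial_tail(12, 16))  # all four engines pooled
```

Under these assumptions a single 3 - 1 result has a tail probability of 5/16 (about 0.31), so it carries little weight on its own. The pooled 12 - 4 gives 2517/65536 (about 0.038). Neither figure says anything about the size of the effect in ELO.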

Posted: **Mon Jan 01, 2018 4:29 pm**

q30 wrote:... and in sparring between Go engines of equivalent strength, an engine with pondering may pass the same engine without pondering in rating.

So we agree that the question is only relevant in matches between engines where one is able to ponder and the other is not? Fine. In your rating list Hiratuka is the only engine that does not ponder. You once had DarkGo, which moves instantly, and Hira is limited to one minute. So for both, the question of the absolute time control and CPU power is much more relevant than the question of running them in ponder-on or ponder-off matches. And still our results are nearly the same. So what is your point in always calling others' results "synthetic"? Your testing does not become more valuable by disparaging others'. For me it does not make any sense to test engines made for GPU support without a GPU, so I have to play with ponder off to get realistic results.

Posted: **Wed Jan 03, 2018 6:48 am**

Finally a Leela Zero update. v0.9 with the network file from 2018.1.1 makes it into League B and is now stronger than the human-trained version 0.1, which placed 6th in League B with 15 points against the same opponents.

Leela vs. AQ

    1. AQ 2.0.3                     12/16
    2. Leela 0.11.0 Beta 11          4/16

Configuration:

League A:

    1. Leela 0.10.0                 22/24
    2. Rayon 4.6.0                  19/24
    3. Oakfoam 0.2.1 NG-06          18/24
    4. Hiratuka 10.37B (CPU)         9/24
    5. DarkForest v2 MCTS 1.0        7/24
    6. DarkGo 1.0                    5/24
    7. Pachi DCNN 11.99              4/24

Configuration:

League B:

    1. Ray 9.0.1                    29/32
    2. Pachi DCNN 11.99             28/32
    3. Leela Zero 0.9 (2018.01.01)  19/32
    4. MoGo 4.86                    18/32
    5. deltaGo 1.0.0                17/32
    6. Fuego 1.1                    15/32
    7. Michi C-2 1.4.2               8/32
    8. Orego 7.08                    8/32
    9. GNU Go 3.8                    2/32

Configuration:

League C:

    1. GNU Go 3.8                   24/28
    2. Hara 0.9                     18/28
    3. Dariush 3.1.5.7              16/28
    4. Indigo 2009                  15/28
    5. Matilda 1.24                 15/28
    6. Aya 6.34                     11/28
    7. Fudo Go 3.0                  11/28
    8. JrefBot 081016-2022           2/28

Configuration:

League D:

    1. JrefBot 081016-2022          16/18             
    2. Iomrascálaí 0.3.2            15/18
    3. Crazy Patterns 0008-13       13/18
    4. Marcos Go 1.0                13/18
    5. AmiGo 1.8                    13/18
    6. Beancounter 0.1               8/18
    7. Stop 0.9-005                  5/18
    8. GoTraxx 1.4.2                 3/18
    9. CopyBot 0.1                   2/18
   10. Brown 1.0                     2/18

Configuration:

Links:

Best,
Alex

Engine Tournament