LZ's progression

nbc44
Dies in gote
Posts: 50
Joined: Sat Sep 15, 2018 2:34 am
GD Posts: 0
Been thanked: 3 times

Re: LZ's progression

Post by nbc44 »

hoa803 wrote:It might provoke an interesting discussion - the folks on GitHub don't feel the time parity matches are a good measure of engine strength, but rather visits. I don't claim to totally understand the reasoning but it might be worth looking into.

I think these visit tests are a "spherical horse in a vacuum", known in the West as a spherical cow. :lol:

P.S. For https://github.com/leela-zero/leela-zero/issues/2330#issuecomment-482729494 right now:
#219 (-v 1600) vs elfv2 (-v 3200)
10 wins, 40 losses
50 games played.
:o

EDIT 1.

P.P.S. For https://github.com/leela-zero/leela-zero/issues/2330#issuecomment-482957781 right now:

#219 (-v 1600) vs elfv2 (-v 3200)
22 wins, 125 losses
157 games played.
:lol:

to be continued...
Last edited by nbc44 on Sun Apr 14, 2019 5:53 am, edited 1 time in total.

Re: LZ's progression

Post by nbc44 »

Time parity match with statistically significant result :salute:
(part II).
LZ0v17 #219 vs Elfv2
2x1080ti, 10s per move.
C:\APPS\l0gpu17\validation.exe -k 219elfv2-10s -s "0:10" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 10 1" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 10 1"

#219 v elfv2 ( 400 games)
           wins        black       white
#219   155 38.75%   64 37.21%   91 39.91%
elfv2  245 61.25%  108 62.79%  137 60.09%
                   172 43.00%  228 57.00%

(part III).
LZ0v17 #219 vs Elfv2
2x1080ti, 3s per move.
C:\APPS\l0gpu17\validation.exe -k 219elfv2-3s -s "0:10" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 3 1" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 3 1"

#219 v elfv2 ( 403 games)
           wins        black       white
#219   153 37.97%   58 35.37%   95 39.75%
elfv2  250 62.03%  106 64.63%  144 60.25%
                   164 40.69%  239 59.31%
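
Whether results like these are statistically significant can be checked with a confidence interval on the win rate. The sketch below is not part of the original test setup; it assumes games are independent and uses a Wilson score interval at roughly 95% (z = 1.96). The helper name `wilson_interval` is made up for illustration.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (assumes independent games)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# #219's results from the two time parity matches above
for label, wins, games in [("10s per move", 155, 400), ("3s per move", 153, 403)]:
    low, high = wilson_interval(wins, games)
    print(f"{label}: {wins}/{games} = {wins / games:.2%}, "
          f"95% interval {low:.2%} to {high:.2%}")
```

If the upper bound stays below 50%, the match supports the claim that #219 was the weaker side under those exact settings.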
Attachments
219elfv2-3s.zip
(326.11 KiB) Downloaded 700 times
219elfv2-10s.zip
(342.38 KiB) Downloaded 711 times
hoa803
Beginner
Posts: 19
Joined: Tue Apr 02, 2019 7:12 pm
GD Posts: 0
Been thanked: 2 times

Re: LZ's progression

Post by hoa803 »

My suspicion is that time parity might be correct with two entirely separate machines, with the same hardware, and using ponder. Basically what seems to have been done in the AlphaZero paper? If I recall, they used 90 mins main time and 15s/move byo-yomi. I'm not as sure about doing it on a single machine with --noponder, however.

The reason I wonder is because I know that each move the NN makes is not independent of what it calculated on the previous move(s). We also know that the number of visits calculated on each position will vary wildly for a given amount of time. That will definitely add some serious randomness to the performance of an engine throughout a game. What I don't know is, does that even matter? (see: comment)

Ultimately I think it may be like football. Take the best teams in the world, say Manchester City, Barcelona, etc. Now change the rules of the game in some fundamental way. Maybe some other team will now be stronger.

That may be a poor analogy, but basically I'm saying that when we declare an engine stronger given test XYZ, all we're saying is that the statement is true under those exact conditions only - especially when the engines are very similar in strength, like elf and lz appear to be at this point.
maf
Dies in gote
Posts: 30
Joined: Tue Aug 05, 2014 3:09 am
Rank: 3d
GD Posts: 0
Has thanked: 2 times
Been thanked: 9 times

Re: LZ's progression

Post by maf »

of course, just an inconvenient truth

Re: LZ's progression

Post by nbc44 »

hoa803 wrote:My suspicion is that time parity might be correct with two entirely separate machines, with the same hardware, and using ponder.

What do you think about a test with one computer, ponder, and one dedicated gpu for each side? I believe that this would be a more or less honest test.
splee99
Dies with sente
Posts: 101
Joined: Thu Nov 15, 2012 9:46 pm
Rank: KGS 2 D
GD Posts: 0
Has thanked: 2 times
Been thanked: 16 times

Re: LZ's progression

Post by splee99 »

I think one computer only has one interface bus between the CPU and the GPU. So that part is actually shared and the bot using less interface time will take more advantage.

Re: LZ's progression

Post by nbc44 »

splee99 wrote:I think one computer only has one interface bus between the CPU and the GPU. So that part is actually shared and the bot using less interface time will take more advantage.

For 3200 visits and a gpu(!) client? It's funny, isn't it?

Re: LZ's progression

Post by nbc44 »

Visit parity match with statistically significant result :salute:
LZ0v17 #219 (1600 visits) vs Elfv2 (3200 visits) 2x1080ti
C:\APPS\l0gpu17\validation.exe -k 219-elfv2 -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 0 --gpu 1 --noponder -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 --noponder -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

#219 v elfv2 ( 400 games)
           wins        black       white
#219   129 32.25%   60 31.58%   69 32.86%
elfv2  271 67.75%  130 68.42%  141 67.14%
                   190 47.50%  210 52.50%


In my case, everything is very bad.
Attachments
219-elfv2.zip
(331.93 KiB) Downloaded 675 times

Re: LZ's progression

Post by nbc44 »

Visit parity match
LZ0v17 #219 (1600 visits) vs Elfv2 (3200 visits) 1x1080ti per side + ponder
Part1 - #219 (GPU0) vs Elfv2 (GPU1)
C:\APPS\l0gpu17\validation.exe -k 219-elfv2-1gpu -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 0 -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 1 -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

#219 v elfv2 ( 208 games)
           wins        black       white
#219    74 35.58%   34 34.69%   40 36.36%
elfv2  134 64.42%   64 65.31%   70 63.64%
                    98 47.12%  110 52.88%

Part2 - #219 (GPU1) vs Elfv2 (GPU0)
C:\APPS\l0gpu17\validation.exe -k 219-elfv2-1gpu -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 1 -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

#219 v elfv2 ( 208 games)
           wins        black       white
#219    81 38.94%   43 39.45%   38 38.38%
elfv2  127 61.06%   66 60.55%   61 61.62%
                   109 52.40%   99 47.60%

Summary:

#219 vs elfv2 (37.26%)
+155-261=0
:clap:
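
The combined score can be reproduced from the two parts and turned into a rough Elo gap. This is only a sketch: it assumes the usual logistic Elo model, no draws, and that both parts can simply be pooled. The function name `elo_diff` is hypothetical.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score (logistic Elo model)."""
    return -400 * math.log10(1 / score - 1)

# (wins for #219, wins for elfv2) in Part1 and Part2
parts = [(74, 134), (81, 127)]
wins = sum(w for w, _ in parts)
losses = sum(l for _, l in parts)
games = wins + losses
score = wins / games

print(f"+{wins}-{losses}=0, {score:.2%} over {games} games")
print(f"implied Elo difference for #219: {elo_diff(score):.0f}")
```

Under these assumptions the pooled result puts #219 at 1600 visits roughly 90 Elo below elfv2 at 3200 visits, which applies to this setup only.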
Attachments
219-elfv2-1gpu-part2.zip
(175.14 KiB) Downloaded 682 times
219-elfv2-1gpu-part1.zip
(173.27 KiB) Downloaded 721 times
Aram
Dies in gote
Posts: 53
Joined: Tue Jun 14, 2016 9:46 am
Rank: KGS 2k
GD Posts: 0
Has thanked: 3 times
Been thanked: 33 times

Re: LZ's progression

Post by Aram »

Is the difference in speed between the ELF network and the 40b network really 2x for you?
I know that theoretically that could be true, but if ive understood correctly, the difference isnt nearly that large in practise?


If you load the 40b network in leela, and write netbench 50000 and then load the elf network and write netbench 50000,
do you really play those 50.000 playouts in half the time with the ELF network?

Re: LZ's progression

Post by nbc44 »

Aram wrote:Is the difference in speed between the ELF network and the 40b network really 2x for you?
I know that theoretically that could be true, but if ive understood correctly, the difference isnt nearly that large in practise?


If you load the 40b network in leela, and write netbench 50000 and then load the elf network and write netbench 50000,
do you really play those 50.000 playouts in half the time with the ELF network?


1). https://lifein19x19.com/viewtopic.php?p=242577#p242577

2).

c:\apps\l0gpu17\leelaz.exe --precision single -t 24 --gpu 0 --gpu 1  -w C:\APPS\net\00ff08eb.gz

Leela: netbench 50000
50000 evaluations in 58.73 seconds -> 851 n/s

c:\apps\l0gpu17\leelaz.exe --precision single -t 24 --gpu 0 --gpu 1  -w C:\APPS\net\05dbca15.gz

Leela: netbench 50000
50000 evaluations in 29.81 seconds -> 1677 n/s
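
For reference, the speed ratio follows directly from the two netbench timings above. A small sketch of the arithmetic, using only the numbers in the output (the variable names are made up):

```python
EVALS = 50000
seconds_lz219 = 58.73   # 00ff08eb.gz (#219)
seconds_elfv2 = 29.81   # 05dbca15.gz (elfv2)

rate_lz219 = EVALS / seconds_lz219   # evaluations per second
rate_elfv2 = EVALS / seconds_elfv2
ratio = rate_elfv2 / rate_lz219      # how much faster the elfv2 net evaluates

print(f"#219:  {rate_lz219:.0f} n/s")
print(f"elfv2: {rate_elfv2:.0f} n/s")
print(f"ratio: {ratio:.2f}x")
```

On this hardware the ratio comes out just under 2x, which is where 1600 vs 3200 visits comes from.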

Re: LZ's progression

Post by hoa803 »

NBC, if you're using visit parity you shouldn't use ponder. The time to reach that number of visits varies by position. Time parity matches can use ponder on separate hardware though, similar to how Alphago was tested.

There's a thread on GitHub with a visit "parity" (1600 vs 3200) match between 220 and elfv2. The result was inconclusive and seems to indicate they're about the same strength at that visit count.

Early in the match LZ appeared to be stronger with over 95% confidence, but by the end the result evened out.
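
An early "95% confidence" that later evens out is what you would expect if the score is checked repeatedly during a match. The simulation below is a sketch only, not the GitHub test: it assumes two exactly equal engines, independent games, a normal-approximation test against 50%, and made-up parameters (400 games, a check every 20 games).

```python
import math
import random

def significant(wins, games, z=1.96):
    """True if the win share is more than z standard errors away from 50%."""
    if games == 0:
        return False
    se = math.sqrt(0.25 / games)  # standard error assuming equal strength
    return abs(wins / games - 0.5) > z * se

def false_alarm_rates(matches=2000, games=400, check_every=20, seed=0):
    """Share of equal-strength matches flagged when peeking vs. only at the end."""
    rng = random.Random(seed)
    peek_hits = 0
    final_hits = 0
    for _ in range(matches):
        wins = 0
        flagged = False
        for g in range(1, games + 1):
            wins += rng.random() < 0.5   # coin-flip game between equal engines
            if g % check_every == 0 and significant(wins, g):
                flagged = True           # looked "significant" at some checkpoint
        peek_hits += flagged
        final_hits += significant(wins, games)
    return peek_hits / matches, final_hits / matches

peek_rate, final_rate = false_alarm_rates()
print(f"flagged at some checkpoint: {peek_rate:.1%}")
print(f"flagged at the final count: {final_rate:.1%}")
```

Checking only at a game count fixed in advance keeps the false alarm rate near the nominal level; peeking along the way pushes it higher.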

Edit: a word
Last edited by hoa803 on Tue Apr 23, 2019 4:21 pm, edited 1 time in total.

Re: LZ's progression

Post by splee99 »

My observation is that elfv2 is well trained to make sharp attacks in the early stage of a game. However, it does have many blind spots in complicated life-and-death situations, which LZ can take advantage of.
And
Gosei
Posts: 1464
Joined: Tue Sep 25, 2018 10:28 am
GD Posts: 0
Has thanked: 212 times
Been thanked: 215 times

Re: LZ's progression

Post by And »

What is the strongest 15b network? edb61bc2, 0a963117 or another?

Re: LZ's progression

Post by hoa803 »

And wrote:What is the strongest 15b network? edb61bc2, 0a963117 or another?


There was a GitHub thread about a 15b trained on 40b a while back. Unsure if anyone is still doing it.

https://github.com/leela-zero/leela-zero/issues/2192