All times are UTC - 8 hours [ DST ]




Post new topic Reply to topic  [ 418 posts ]  Go to page Previous  1 ... 13, 14, 15, 16, 17, 18, 19 ... 21  Next
 Post subject: Re: LZ's progression
Post #301 Posted: Thu Feb 07, 2019 3:43 am 
Gosei

Posts: 1348
Liked others: 202
Was liked: 203
Does anyone know where to download LZ ZQ elf-2 and LZ ZQ elf-5?
https://github.com/breakwa11/GoAIRatings

 Post subject: Re: LZ's progression
Post #302 Posted: Wed Feb 13, 2019 9:56 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
20 game match at time parity between
LZ0.16 #204 and LZ0.16 Elfv2
1x1080, twogtp 1.5.0, 5min per side and per game.

Elfv2 wins 13-7
All games by resignation, no error, no duplicate game.
Stats :
Attachment:
204 v elfv2.gif [ 52.85 KiB | Viewed 11056 times ]

 Post subject: Re: LZ's progression
Post #303 Posted: Wed Feb 13, 2019 10:07 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Vargo, about how many playouts per move is this? The official LZ test was 1600 each and LZ won 65%.

 Post subject: Re: LZ's progression
Post #304 Posted: Wed Feb 13, 2019 10:44 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Uberdude wrote:
how many playouts per move is this?
With 5 min per side per game, only ~3.5 min/game is effectively used, which works out to ~2s/move. That is similar to -v 1600 for #204 and -v 3000 for Elfv2, all of this with 1x1080.


This post by Vargo was liked by: Uberdude
 Post subject: Re: LZ's progression
Post #305 Posted: Thu Feb 14, 2019 9:39 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Another 10 game match between LZ0.16_#204 and LZ0.16_Elfv2 at time parity
2x1080Ti, 5 minutes per side per game (probably similar to -v 5000 for #204 and to -v 9000 for Elfv2)
twogtp 1.5.0, no pondering, komi 7.5, no duplicate game, no error.
Result : Elfv2 wins 7-3.

The games :
Attachment:
204_Elfv2.zip [9.67 KiB]
Downloaded 482 times
I've used "-alternate", so, #204 is B in the even numbered games, and #204 is W for the odd numbers.
(#204 only won the games numbered 1, 3, and 6)


The command lines and the stats :
Attachment:
elfv2v204.gif [ 50.97 KiB | Viewed 10996 times ]

 Post subject: Re: LZ's progression
Post #306 Posted: Sun Feb 17, 2019 9:47 pm 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
LZ0.16_#204 vs LZ0.16_Elfv2 2x1080Ti, 3s per move:
gogui-twogtp -black "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05d10f27.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -white "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05dbca15.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -games 100 -sgffile 204-elfv2 -auto -time 1s+4s/1 -komi 7.5 -verbose
gogui-twogtp -white "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05d10f27.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -black "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05dbca15.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -games 100 -sgffile elfv2-204 -auto -time 1s+4s/1 -komi 7.5 -verbose

Nothing interesting:

Code:
+28-72=0 (as black)
+34-66=0 (as white)

Total: +62-138=0
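
To judge how much a match score of a given length really says, here is a minimal sketch of a Wilson score interval applied to two scores quoted in this thread. The function name `wilson_interval` and the choice of z = 1.96 (roughly 95%) are assumptions made for illustration; nothing here is reported by twogtp itself.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# Scores quoted in this thread, taken from the side that won fewer games:
# 7 wins out of 20 (the 13-7 match) and 62 wins out of 200 (the total above).
for label, wins, games in [("20 games", 7, 20), ("200 games", 62, 200)]:
    lo, hi = wilson_interval(wins, games)
    print(f"{label}: {wins / games:.1%} observed, interval {lo:.1%} to {hi:.1%}")
```

On the 20-game score the interval still straddles 50%, while on the 200-game total it stays below 50%, which is one way to see why the longer match is more informative.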


Attachments:
204-elfv2-stat.zip [5.72 KiB]
Downloaded 471 times
 Post subject: Re: LZ's progression
Post #307 Posted: Tue Feb 19, 2019 3:10 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
nbc44 wrote:
Nothing interesting:
Why do you say that ? I find it very interesting, particularly considering it's 200 games :tmbup: !
________________________________________________________________________________________

New network #205

40 game match #205 v. Elfv2

1x1080, 5min per side and per game, no pondering, komi 7.5

Elfv2 wins 25-15 (62.5 %)

40 games :
Attachment:
elfV2_205.zip [34.9 KiB]
Downloaded 469 times

Command lines and stats (205 is B) :
Attachment:
Elfv2_205B.gif [ 96.14 KiB | Viewed 10694 times ]
Command lines and stats (205 is W) :
Attachment:
Elfv2_205W.gif [ 99.14 KiB | Viewed 10694 times ]

 Post subject: Re: LZ's progression
Post #308 Posted: Tue Feb 19, 2019 9:15 pm 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
Vargo wrote:
Why do you say that ? I find it very interesting, particularly considering it's 200 games :tmbup: !

I suppose the test result is predetermined.

Long test now:
LZ0.16_#205 vs LZ0.16_Elfv2 - 2x1080Ti, 120s (wow!) per move, (it will be 10 games):

+1-4=0 (#205 is black)
+1-4=0 (#205 is white)

Elfv2 wins 8-2 (80 %)

P.S.
Dragon tail loss :o :


Attachments:
File comment: logs-part2
elfv2-205.zip [1.21 MiB]
Downloaded 478 times
File comment: logs-part1
205-elfv2.zip [1.26 MiB]
Downloaded 451 times
games.zip [10.1 KiB]
Downloaded 428 times
 Post subject: Re: LZ's progression
Post #309 Posted: Sun Mar 03, 2019 8:25 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
In another thread, @jlt wrote an interesting comment :
Quote:
... I would be surprised if, for some n, LeelaZero(n) didn't beat LeelaZero(n-10) more than 50% of the time.


The latest 40b network is #207; it is now 50 networks past the last 15b, and 30+ networks past the last 20b.

20 game matches LZ(n) v. LZ(n-10) at time parity, 3 min/game and /side, 1x1080, komi 7.5, no pondering, LZ0.16, twogtp 1.5.0.

#207 v. #197 --> 12-8 (40b v. 40b)
#197 v. #187 --> 12-8 (40b v. 40b)
#187 v. #177 --> 15-5 (40b v. 40b)
#177 v. #167 --> 13-7 (40b v. 20b)
#167 v. #157 --> 5-15 (20b v. 15b)

And one more match : LZ(n) v. LZ(n-50)

#207 v. #157 --> 15-5 (40b v. 15b)

All games by resignation, no error, no duplicate game.

Average time was around 1.3 sec/move.

Below, the little hands point to the networks #157, #167, #177, etc.
Attachment:
elo2.gif [ 27.31 KiB | Viewed 10360 times ]

If someone wants the games or the stats, I'll upload them.


This post by Vargo was liked by: Waylon
 Post subject: Re: LZ's progression
Post #310 Posted: Sun Mar 03, 2019 8:32 am 
Gosei

Posts: 1753
Liked others: 177
Was liked: 491
Yes, I should have added the condition "if LZ(n) and LZ(n-10) are networks of the same size". Changing the network size introduces some discontinuity. When 20-block networks were introduced, results were disappointing, which is why the LeelaZero project shifted to 40 blocks rather quickly.

 Post subject: Re: LZ's progression
Post #311 Posted: Sun Mar 03, 2019 9:45 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
You're right, 15b #157 was a turning point, and 20b #158 was weaker.

Another 20 game match (just finished, with the same parameters) :
LZ(n) v. LZ(n-49)

#207 v. #158 --> 19-1 (40b v. 20b)

Not very surprising, but still... it's hard to claim that LZ isn't progressing anymore ;-)

 Post subject: Re: LZ's progression
Post #312 Posted: Mon Mar 04, 2019 10:41 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
100 game match : LZ(today) v. LZ(1 year ago)

One year ago, the best LZ network was #90 (6x128)
2 minutes per game and side, LZ0.16, twogtp 1.5.0 no pondering, komi 7.5, gpu : 1x1080

Try to guess the result :scratch:
NB. Because of the "-alternate" command, #207 is always named B, even though it was W 50 times.
Attachment:
90.jpg [ 331.19 KiB | Viewed 10226 times ]

 Post subject: Re: LZ's progression
Post #313 Posted: Sat Mar 09, 2019 10:46 pm 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
What's the effect of the number of visits on a given network ?
For example, what would be the score of LZ#207 --visits=801 v. LZ#207 --visits=1601 ?

I ran such a match yesterday (#207 with --visits=1, --visits=401, --visits=801, --visits=1601, --visits=3201) but... the results were inconclusive, more than half the games were duplicates :sad: :sad: :sad:
Attachment:
dup.gif [ 46.78 KiB | Viewed 9931 times ]
probably because #207 knows all the tricks of #207 ;-)

______________________________________________________________________________

Anyway, there's a new network, #208.
20 game matches : #208 with various visits counts, and -m 40
-m 40 is used to have a bit more randomness in the first 40 moves, and so, avoid duplicate games.
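
A rough sketch of how duplicate games could be spotted from the SGF files themselves, by comparing move sequences. This is not how twogtp detects duplicates; the regular expression and the function names `move_sequence` and `find_duplicates` are hypothetical, and the pattern only handles simple move nodes where B or W is the first property of the node.

```python
import re
from collections import defaultdict

# Matches simple move nodes such as ";B[pd]" or ";W[dp]" (an empty "[]" is a pass).
MOVE = re.compile(r";\s*([BW])\[([a-z]{0,2})\]")

def move_sequence(sgf_text):
    """Return the game's moves as a tuple of (colour, coordinate) pairs."""
    return tuple(MOVE.findall(sgf_text))

def find_duplicates(games):
    """Group game names whose move sequences are identical.

    `games` maps a game name to its SGF text; only groups of 2+ are returned.
    """
    groups = defaultdict(list)
    for name, sgf in games.items():
        groups[move_sequence(sgf)].append(name)
    return [names for names in groups.values() if len(names) > 1]

# Tiny made-up SGF fragments, only to show the call.
sample = {
    "game-0": "(;GM[1]SZ[19];B[pd];W[dp])",
    "game-1": "(;GM[1]SZ[19]PB[x];B[pd];W[dp])",
    "game-2": "(;GM[1]SZ[19];B[dd];W[pp])",
}
print(find_duplicates(sample))
```

Comparing whole move sequences only catches exact repeats; games that diverge late would need a prefix comparison instead.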

Code:
gogui-twogtp -black "C:\PATH TO LZ\leelaz.exe --gtp --weights=C:\PATH TO NETWORKS\208.gz --noponder -m 40 -v yyyy" -white "C:\PATH TO LZ\leelaz.exe --gtp --weights=C:\PATH TO NETWORKS\208.gz --noponder -m 40 -v zzzz" -games 20 -sgffile XXX -auto -komi 7.5 -alternate
twogtp 1.5.0, LZ0.16, gpu:1x1080
no duplicate game, no error.

time/move seems to scale linearly :
-v 1 : ~0 sec/move
-v 401 : ~0.8
-v 801 : ~1.5
-v 1601 : ~3
-v 3201 : ~5 to 6
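
A quick least-squares check of the "linear" impression, using the timings listed above. The value 5.5 for -v 3201 is an assumption (the midpoint of "5 to 6"), and `fit_line` is just an illustrative helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

visits = [1, 401, 801, 1601, 3201]
seconds = [0.0, 0.8, 1.5, 3.0, 5.5]  # 5.5 is an assumed midpoint of "5 to 6"

slope, intercept = fit_line(visits, seconds)
print(f"about {1 / slope:.0f} visits per second, intercept {intercept:.2f} s")
for v, s in zip(visits, seconds):
    print(f"-v {v}: measured {s:.1f} s, fitted {slope * v + intercept:.1f} s")
```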


Results :
Attachment:
208.gif [ 9.4 KiB | Viewed 9931 times ]



If someone wants all the stats (times, lengths, etc.), I'll upload them.


All the games :
The smaller number of visits is always B in the even numbered games (and W in the odd ones).
For example, 208_401_801-17 is game number 17 between #208 with 401 visits and #208 with 801 visits; 401 visits is W.

Attachment:
games.zip [143.93 KiB]
Downloaded 455 times

 Post subject: Re: LZ's progression
Post #314 Posted: Sun Mar 10, 2019 10:37 am 
Dies in gote

Posts: 30
Liked others: 2
Was liked: 9
Rank: 3d
Did a quick test using LZ207, p100 vs p1000, got 0:20. Nothing surprising, just fyi.

 Post subject: Re: LZ's progression
Post #315 Posted: Mon Mar 11, 2019 11:52 am 
Gosei

Posts: 1348
Liked others: 202
Was liked: 203
Several 25x25 matches. The nets were obtained with the program at https://drive.google.com/open?id=1bgkVB ... oXHUdDuqt7,
https://github.com/leela-zero/leela-zero/issues/2240. 10 sec/move, CPU only, gogui-twogtp:
LM 192x15 GX89(25x25) - LZ 40x256 #205(25x25) 25:15
LZ 192x15 f438268e(25x25) - LZ 40x256 #205(25x25) 18:22
elf v2 256x20(25x25) - LZ 40x256 #205(25x25) 17:23; all (11) of Elf's wins as Black were because of the ladder
The converted minigo (25x25) nets 000930-goliath and 000990-cormorant do not work in gogui or sabaki.
Can someone with a powerful GPU run a couple of matches?

 Post subject: Re: LZ's progression
Post #316 Posted: Tue Mar 12, 2019 5:16 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Here are some more 20 game matches of #208 v. #208, with --visits=6401

Same parameters, except for --gpu 0 --gpu 1 (2x1080Ti)
It shouldn't change anything.
No error , no duplicate game.

So, same table as before, with an extra line (6401 → ...)
Attachment:
6401.gif [ 12.09 KiB | Viewed 10474 times ]


Seems like more visits really make a difference, I find the score of 6401 v. 801 especially harsh :o !


The games between -v 6401 and -v 3201 (3201 is B in the even numbered games):
Attachment:
208_3201v6401.zip [17.77 KiB]
Downloaded 432 times


The stats for -v 6401 vs -v 3201 :
Attachment:
stats.gif [ 199.73 KiB | Viewed 10419 times ]

If someone wants the other stats or games, I can upload them.

 Post subject: Re: LZ's progression
Post #317 Posted: Tue Mar 12, 2019 3:56 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Vargo wrote:
Seems like more visits really make a difference, I find the score of 6401 v. 801 especially harsh :o !
IIRC similar tests were posted on github a year ago, and at that time doubling playouts seemed to give roughly a 75% winrate. This is consistent with performance distributions about one standard deviation apart, which in turn can explain the behaviour at quadruple and octuple visits (3sd->98%, though doubling visits is not the same as doubling playouts, and at high visits the relations may change as well).
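
A small sketch of the arithmetic behind those figures, assuming (my reading, not stated in the post) that each side's performance in a game is an independent normal variable with the same spread, and that each doubling shifts the mean by one standard deviation. `win_probability` is an illustrative name.

```python
import math

def win_probability(gap_in_sd):
    """P(stronger side wins) when both performances are normal with equal spread
    and the means differ by gap_in_sd standard deviations."""
    # The difference of two independent unit-variance normals has standard
    # deviation sqrt(2), so the win probability is Phi(gap / sqrt(2)),
    # which equals 0.5 * (1 + erf(gap / 2)).
    return 0.5 * (1 + math.erf(gap_in_sd / 2))

for doublings in (1, 2, 3):
    print(f"{2 ** doublings}x: gap {doublings} sd, win probability {win_probability(doublings):.1%}")
```

Under these assumptions a gap of one standard deviation gives about 76%, close to the quoted 75%, and three standard deviations give about 98%.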


This post by moha was liked by: maf
 Post subject: Re: LZ's progression
Post #318 Posted: Wed Mar 13, 2019 2:14 pm 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
Time parity match.
LZ0.16 XXX and LZ0.16 Elfv2
2x1080ti, 60s per move.
C:\APPS\l0gpu16\validation.exe -n C:\APPS\net\XXX.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu16\leelaz --gtp-command "time_settings 1 61 1" -- C:\APPS\l0gpu16\leelaz --gtp-command "time_settings 1 61 1" -k XXX-elfv2

1). #205
Code:
#205 v elfv2 ( 26 games)
           wins        black       white
#205    12 46.15%    2 50.00%   10 45.45%
elfv2   14 53.85%    2 50.00%   12 54.55%
                     4 15.38%   22 84.62%

2). #207
Code:
#207 v elfv2 ( 26 games)
           wins        black       white
#207    13 50.00%    7 53.85%    6 46.15%
elfv2   13 50.00%    6 46.15%    7 53.85%
                    13 50.00%   13 50.00%

3). #208
Code:
#208 v elfv2 ( 26 games)
           wins         black      white
#208     4 15.38%    1  9.09%    3 20.00%
elfv2   22 84.62%   10 90.91%   12 80.00%
                    11 42.31%   15 57.69%

4). #210
in progress...


Attachments:
l0-elfv2.zip [69.66 KiB]
Downloaded 417 times
 Post subject: Re: LZ's progression
Post #319 Posted: Sun Mar 17, 2019 3:02 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
New network #212

Quick test about @jlt's law ;-)
(reminder : LZ#(n) is stronger than LZ#(n-10) at block and time parity)

added parameters -m 20, to avoid duplicate games, and -v 1601, to "standardize" the test.

50 games, no duplicate, no error.
Result : #212 wins 32-18 (64%)
__________________________________________________________________________

And now, how about a little controversy... :D :D

If #n wins 55% of its games against #n-1, and
If #n-1 wins 55% of its games against #n-2,and
...
and #n-9 wins 55% of its games against #n-10

#n should win 88% of its games against #n-10, but in this test, it wins only 64%...


In this case, it's as if the real average winrate of #n against #n-1 was only ~51.5% , and not 55%
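
The 88% and ~51.5% figures can be reproduced with the usual logistic Elo formula, assuming rating differences simply add up along the chain of ten networks. That additivity is exactly the assumption under discussion, and the helper names below are illustrative.

```python
import math

def elo_from_winrate(p):
    """Rating difference implied by a winrate under the logistic Elo model."""
    return 400 * math.log10(p / (1 - p))

def winrate_from_elo(diff):
    """Expected winrate for a given rating difference."""
    return 1 / (1 + 10 ** (-diff / 400))

# Ten steps of 55% each, with rating differences assumed to add up.
chained = winrate_from_elo(10 * elo_from_winrate(0.55))

# The per-step winrate that would give 64% over ten equal steps.
per_step = winrate_from_elo(elo_from_winrate(0.64) / 10)

print(f"10 x 55% chained: {chained:.1%}")
print(f"per-step winrate implied by 64%: {per_step:.1%}")
```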


Some caveats : -m 20 can alter results, and 50 games is not enough, but still, I remember @moha spoke about the primary source of Elo inflation being the amount of luck accumulated by the new networks in test matches. I think he was right.
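
A toy simulation of that luck effect, under assumptions that are mine and not measured: every candidate has the same true winrate of 52% against the current best, each test match is 400 games, and a candidate is promoted when it scores 55% or more. `simulate_gate` is a hypothetical helper.

```python
import random

def simulate_gate(true_p=0.52, games=400, gate=0.55, candidates=2000, seed=1):
    """Return (number promoted, mean observed winrate of the promoted candidates)."""
    rng = random.Random(seed)
    passed = []
    for _ in range(candidates):
        # One test match: count wins over `games` independent games.
        wins = sum(rng.random() < true_p for _ in range(games))
        if wins / games >= gate:
            passed.append(wins / games)
    mean_observed = sum(passed) / len(passed) if passed else float("nan")
    return len(passed), mean_observed

promoted, mean_observed = simulate_gate()
print(f"promoted {promoted} of 2000, mean observed winrate {mean_observed:.1%}, true winrate 52.0%")
```

By construction, the promoted candidates show an observed winrate at or above the gate even though their true winrate is lower, so ratings built from those test matches would overstate the gain.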

Code:
gogui-twogtp -black "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\212.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -white "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\202.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -games 50 -sgffile 212_202 -auto -komi 7.5 -alternate

The 50 games :
Attachment:
212_202.zip [43.7 KiB]
Downloaded 426 times
EDIT : #212 is B in the even numbered games, and W in the odd ones.

 Post subject: Re: LZ's progression
Post #320 Posted: Sun Mar 17, 2019 6:24 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Vargo wrote:
If #n wins 55% of its games against #n-1, and
If #n-1 wins 55% of its games against #n-2,and
...
and #n-9 wins 55% of its games against #n-10

#n should win 88% of its games against #n-10, but in this test, it wins only 64%...

In this case, it's as if the real average winrate of #n against #n-1 was only ~51.5% , and not 55%



Why should it? That's an assumption e.g. Elo rating systems take to make the problem simple enough to tackle, but there's no logical 'should' about it. If Man City beat Arsenal 3-0 and Arsenal beat Chelsea 2-0 we can't say Man City should beat Chelsea 5-0.
