All times are UTC - 8 hours [ DST ]




 Post subject: Re: LZ's progression
Post #41 Posted: Fri May 25, 2018 7:39 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
twogtp matches:

1) LZ's networks
Matches between networks #0, #10, #20, #30, ..., #140.
Each network played two 100-game matches against network #70, which is the reference point.
For example, #0 never wins against #70 (1st run = 0 wins out of 100 games, 2nd run = 0 wins),
and #140 almost always wins against #70 (1st run = 99 wins out of 100 games, 2nd run = 99 wins).
Settings: twogtp with LZ 0.15, --visits=51 --noponder
In the table, a line such as "60-70, 29, 17" means network #60 won 29 games out of 100 against #70 in the first match, and 17 out of 100 in the second.
Two odd things:
#20 won 1 game against #70!
For #60, the two results vary a lot (29 and 17).
Attachment:
netw.jpg [ 94.5 KiB ]
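One rough way to judge whether 29 and 17 really "vary a lot" is a two-proportion z statistic. The sketch below assumes the 100 games in each run are independent, which the post does not establish; `two_proportion_z` is a name made up for illustration.

```python
import math

def two_proportion_z(wins_a, games_a, wins_b, games_b):
    """z statistic for the difference between two observed win rates.

    Uses the pooled win rate for the standard error, and assumes
    every game is an independent trial (an assumption, not a fact
    taken from the post).
    """
    p_a = wins_a / games_a
    p_b = wins_b / games_b
    pooled = (wins_a + wins_b) / (games_a + games_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / games_a + 1 / games_b))
    return (p_a - p_b) / se

# Network #60 against #70: 29/100 in the first run, 17/100 in the second.
z = two_proportion_z(29, 100, 17, 100)
print(f"z = {z:.2f}")
```

A |z| of about 2 is near the usual 5% threshold, so under these assumptions a gap like this between two 100-game runs is unusual but not impossible.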



2) Zen7 vs LZ with networks #...
LZ: --visits=3201 --noponder
Zen (gtp4zen): -t 12 -T 1 -s 850
Each match is 20 games (10 as B, 10 as W).
Zen takes about twice as much time as LZ.
Attachment:
zen.jpg [ 67.53 KiB ]


This post by Vargo was liked by: Bill Spight
 Post subject: Re: LZ's progression
Post #42 Posted: Sat May 26, 2018 1:02 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
For an idea of what these win ratios mean in terms of (weaker) human rank difference, check https://senseis.xmp.net/?EGFWinningStatistics. E.g. a 3d beats a 6d about 8% of the time, whilst a 4d beats a 7d about 3% of the time.
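To put such win percentages on a rating scale, a minimal sketch using the standard logistic Elo formula. This is an assumption for illustration: the EGF rating system uses its own formula, so the numbers are only indicative, and `elo_gap` is a made-up name.

```python
import math

def elo_gap(win_rate):
    """Elo difference implied by a win rate under the standard
    logistic Elo model: p = 1 / (1 + 10 ** (-gap / 400)).

    Negative values mean the player with this win rate is the weaker one.
    """
    return 400 * math.log10(win_rate / (1 - win_rate))

# The two examples quoted from the EGF statistics page.
for label, p in (("3d vs 6d", 0.08), ("4d vs 7d", 0.03)):
    print(f"{label}: {p:.0%} wins -> {elo_gap(p):+.0f} Elo")
```

The same function can be applied to any of the match results in this thread, keeping in mind that bot-vs-bot Elo and human ranks are not directly comparable.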

 Post subject: Re: LZ's progression
Post #43 Posted: Sat May 26, 2018 10:11 pm 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Network #144 (9e88) was promoted against #143 (057a) by winning 54.84% of its 403 games.

Is that more or less reproducible? (I think official matches use 3201 visits.)
Here are five twogtp matches, 403 games per match (--visits=xxxx, --noponder).
Up to 200 visits, the win% fluctuates wildly (about 75% at 1 visit, then below 50% at a few visits).
Then at 3201 visits it's 52.35%, which is not bad.
(I won't run many of these 400-game matches at 3200 visits, because it takes a long time, even with a good GPU.)


Attachment:
9e88.jpg [ 81.49 KiB ]
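For a sense of how far a 403-game result can move by chance, here is a minimal sketch of a Wilson score interval for a win rate. The 221 wins are inferred from 54.84% of 403; the function name and the 95% level (z = 1.96) are assumptions, and the games are treated as independent.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate.

    z = 1.96 corresponds to roughly 95% coverage. Assumes the games
    are independent trials with a fixed win probability.
    """
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half

# Promotion match: 54.84% of 403 games is 221 wins.
low, high = wilson_interval(221, 403)
print(f"95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval runs from about 50% to about 60%, so a later 52.35% at the same visit count is well within what one match of this size can produce.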
 Post subject: Re: LZ's progression
Post #44 Posted: Sun May 27, 2018 6:05 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
This may be a good time for buying a lottery ticket. :)
Quote:
luck will usually be there as that is still an easy way towards promotion. Most networks with >55% winrates will in fact be around 52% or so.
Of course, this assumes promotion is rare (it is a kind of survivorship bias). And the difference scales somewhat with the number of sims, so the advantage of the stronger net will likely be bigger in deeper searches.

 Post subject: Re: LZ's progression
Post #45 Posted: Mon May 28, 2018 2:36 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
The 74.68% win rate of network 9e88 against network 057a (with --visits=1) seemed weird...
Was it due to the relatively small sample (403 games) or to --visits=1?

Here are 9 twogtp matches (3 x 403 games, 3 x 1000 and 3 x 10000). I've kept the result reports generated by twogtp; here is one of them.
Attachment:
9e88_057v1_1.zip [26.19 KiB]
If anyone is interested, I can upload the others.

Attachment:
9e88.jpg [ 161.29 KiB ]

I was expecting the maximum variation to decrease as the number of games increased... But going from 35% to 66%???
Am I doing something wrong? Has anyone tried something similar?
Parameters:
LZ: --gtp --weights=xxx --visits=1 --noponder -r 10
twogtp: -games xxxxx -sgffile C:\... -auto -komi 7.5

Curiously, the overall win% is around... 52% ;-)

 Post subject: Re: LZ's progression
Post #46 Posted: Mon May 28, 2018 9:48 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Single-visit games are completely different imo. The value net is not used at all, and policy changes are multiplied. If we expect the net to be slightly stronger at 3000 visits, that includes both policy and value improvements in unknown proportions, and as I mentioned the result only applies to that visit range. With many more visits the new net will appear even stronger, with fewer visits it may be slightly less strong, and single-visit games would be hard to predict.

But the variance anomaly does seem strange; I wonder if there was a problem with your setup. Or maybe the results are correlated: how many unique games were there in each set?

 Post subject: Re: LZ's progression
Post #47 Posted: Mon May 28, 2018 10:27 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
One last match: 20 x 5000 games between 9e88 and 057a with --visits=1 (2-3 hours of computer time; 10 x 5000 as B, 10 x 5000 as W).
Overall result: 9e88 wins 48631 out of 100000 games (48.631%).
For 9e88, the minimum number of wins in a 5000-game match was 1920 (as B), and the maximum was 3321 (as W).
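If the games in each 5000-game match were independent with one fixed win probability, the spread between matches would be binomial. A minimal sketch of that check follows. It assumes the overall 48.631% applies to every match regardless of colour, which the post does not say (results as B and as W may legitimately differ); `binomial_z` is an illustrative name.

```python
import math

def binomial_z(wins, games, p):
    """Number of binomial standard deviations between `wins` and the
    expected win count, for `games` independent games won with
    probability `p`."""
    mean = games * p
    sd = math.sqrt(games * p * (1 - p))
    return (wins - mean) / sd

# Overall rate over all 100000 games, used as the assumed per-game probability.
p_overall = 48631 / 100000

# Lowest and highest 5000-game results reported for 9e88.
for wins in (1920, 3321):
    print(f"{wins} wins: {binomial_z(wins, 5000, p_overall):+.1f} sd")
```

Both extremes lie more than ten standard deviations from the mean, far beyond what independent games would give, so either the two colours have very different win rates or the games are strongly correlated.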

I don't think there was a problem in these matches: the .dat files generated are OK, and there was no crash or hang.

Parameters were the same as before:
LZ: --gtp --weights=xxx --visits=1 --noponder -r 10
twogtp: -games xxxxx -sgffile C:\... -auto -komi 7.5

In conclusion, I think you're right, "single visit games would be hard to predict" :scratch: :)

 Post subject: Re: LZ's progression
Post #48 Posted: Mon May 28, 2018 11:28 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
I meant, did you check that each set of 5000 games actually contains 5000 different games, and not a lot of duplicates?
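One possible way to run that check, as a minimal sketch: reduce every SGF file to its move sequence and count the distinct sequences. It assumes each move is its own node written as ;B[xx] or ;W[xx] and that the games sit in one directory as *.sgf files; the function names are made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# A move node: ";" followed by B or W and a 0-2 letter coordinate
# (an empty value is a pass). Properties such as PB[...] or AB[...]
# do not match because they are not directly preceded by ";".
MOVE_RE = re.compile(r";\s*([BW])\[([a-z]{0,2})\]")

def move_sequence(sgf_text):
    """Return the game as a tuple of (colour, coordinate) pairs."""
    return tuple(MOVE_RE.findall(sgf_text))

def count_unique(sgf_texts):
    """Return (total games, distinct move sequences)."""
    counts = Counter(move_sequence(text) for text in sgf_texts)
    return sum(counts.values()), len(counts)

def count_unique_in_dir(directory):
    """Same check for every *.sgf file in a directory (assumed layout)."""
    paths = sorted(Path(directory).glob("*.sgf"))
    return count_unique(p.read_text(errors="replace") for p in paths)

games = [
    "(;GM[1]SZ[19]PB[a]PW[b];B[pd];W[dp])",
    "(;GM[1]SZ[19]PB[c]PW[d];B[pd];W[dp])",  # same moves as the first game
    "(;GM[1]SZ[19];B[pd];W[dd])",
]
total, unique = count_unique(games)
print(f"{total} games, {unique} unique")
```

If the number of unique games is much smaller than the total, the matches are not independent samples and the usual binomial error bars do not apply.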

 Post subject: Re: LZ's progression
Post #49 Posted: Mon May 28, 2018 8:22 pm 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Ah, but you're right, some of the games are duplicates.
Is there an LZ or twogtp command to prevent that?

 Post subject: Re: LZ's progression
Post #50 Posted: Tue May 29, 2018 3:55 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Vargo wrote:
Ah, but you're right, some of the games are duplicates.
Is there an LZ or twogtp command to prevent that?
IIRC there is a command line option for noising even the policy, but then the strength will be different.

 Post subject: Re: LZ's progression
Post #51 Posted: Sun Jun 03, 2018 3:49 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
What is the influence of --visits=xxxx on the Elo or dan level of a given network?
10 twogtp matches for network #145 (b691).
Each match is 100 games, --noponder -komi 7.5 (no duplicate games in the reports).
For example, in the table below, b691 at 6401 visits won 64% of its games against b691 at 3201 visits.

Attachment:
b691.jpg [ 67.78 KiB ]


Cf. Uberdude's comment and his link about win% (https://senseis.xmp.net/?EGFWinningStatistics)

and this site, https://www.reddit.com/r/cbaduk/comment ... _of_lzero/ , according to which b691 is 10 dan at 1601 playouts (visits?).
At 6400 visits, b691 wins 93% against a 10 dan, wow!
If anyone is interested in the .dat reports, I can upload them.


This post by Vargo was liked by 3 people: Bill Spight, ez4u, Uberdude
 Post subject: Re: LZ's progression
Post #52 Posted: Mon Jun 04, 2018 9:58 pm 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
What about the ELF weights (62b54, or ELF V0)?
How do they scale with the number of visits?
At 1600 playouts, their strength is around 12 dan (cf. same site: https://www.reddit.com/r/cbaduk/comment ... _of_lzero/)

100-game match between 62b54 (6400 visits) and 62b54 (1600 visits)
twogtp, --noponder --visits=1601 (--visits=6401) -komi 7.5. There were no duplicate games in the .dat report, and all games were won or lost by resignation.

Result: 90-10 (6400 visits won 41 games out of 50 as Black, and 49 as White)

The sample is small, but still... at least 1 stone stronger with 4 times more visits.

There were no ladder games in the 30-40 I've looked at. If anyone is interested in the games or the .dat report, I can upload them.


This post by Vargo was liked by 3 people: Bill Spight, ez4u, Waylon
 Post subject: Re: LZ's progression
Post #53 Posted: Mon Jun 04, 2018 10:43 pm 
Oza

Posts: 2401
Location: Tokyo, Japan
Liked others: 2338
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
Very interesting stuff. Thanks for all your efforts! :bow:

BTW, how long did it take your machine to run the ELF match?

_________________
Dave Sigaty
"Short-lived are both the praiser and the praised, and rememberer and the remembered..."
- Marcus Aurelius; Meditations, VIII 21

 Post subject: Re: LZ's progression
Post #54 Posted: Tue Jun 05, 2018 1:22 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Thanks!
The 10 matches with b691 were run on 2 computers, one with 1x1080 and one with 2x1080Ti.
It took around 2 days, on and off.
For example, using both GPUs, it takes about 3-5 minutes per game for 6400 visits against 3200 visits. (cf. left of IMAGE)

The ELF match (6400 vs 1600) was run with 1x1080. It took around 16 hours, with each game taking 6-12 minutes (cf. right of IMAGE).
Each match was two sets of 50 games (50 as B and 50 as W).

IMAGE

 Post subject: Re: LZ's progression
Post #55 Posted: Thu Jun 07, 2018 9:59 pm 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
The ELF network (62b54) is stronger than #147 (10bc1) at the same visit count.
After some games to estimate the difference, I tried visits=12801 and 1601.

Result of a 100-game twogtp match (LZ 0.15 for both, komi=7.5) between

10bc1 (--visits=12801 --noponder) and
62b54 (--visits=1601 --noponder)

10bc1 wins 51-49 (23 as B and 28 as W, no duplicate games)

Average game length: 218 moves; min=91, max=384
10bc1 takes 3.65 times as much time (for 8 times as many visits)


Again, if anyone wants the two .dat reports or the games, I can upload them.

 Post subject: Re: LZ's progression
Post #56 Posted: Sat Jun 09, 2018 6:37 am 
Oza

Posts: 2401
Location: Tokyo, Japan
Liked others: 2338
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
A zip of the games would be interesting.

_________________
Dave Sigaty
"Short-lived are both the praiser and the praised, and rememberer and the remembered..."
- Marcus Aurelius; Meditations, VIII 21

 Post subject: Re: LZ's progression
Post #57 Posted: Sat Jun 09, 2018 9:15 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
{Never mind.}

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

 Post subject: Re: LZ's progression
Post #58 Posted: Tue Jun 12, 2018 6:14 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Sorry for the delay, I was away.

Here is the zip of the 100 games between the ELF network (62b54 at 1600 visits) and network #147 (10bc1 at 12800 visits).
Attachment:
100 games.zip [91.79 KiB]


This post by Vargo was liked by 2 people: ez4u, Uberdude
 Post subject: Re: LZ's progression
Post #59 Posted: Wed Jul 04, 2018 1:38 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Thanks to alreadydone, LZ can now handle high handicap (HERE)

Some H6, H7, and H8 matches between network #153 (e1d46) and network #80 (e1156).
According to this site, #153 is 10.4D, and #80 is exactly 4D.

Sabaki matches, with 3200 visits, no pondering:

at H6, #153 wins 2-0
at H7, #153 wins 2-1
at H8, #153 loses each time, playing first-line moves

So... new H8 matches with 12800 visits for #153 (#80 still at 3200 visits).
#153 manages to win sometimes!

Playing (and winning) H7 and H8 games against a 4 dan... Wow!


H7 # 153 loses
H7 # 153 wins
H7 # 153 wins

H8 # 153 wins

 Post subject: Re: LZ's progression
Post #60 Posted: Wed Jul 04, 2018 3:13 am 
Dies in gote

Posts: 44
Liked others: 2
Was liked: 14
Rank: EGF 1 kyu
KGS: finity
Vargo wrote:
Thanks to alreadydone, LZ can now handle high handicap (HERE)

Playing (and winning) H7 and H8 games against 4 Dan... Wow !


Winning against "4d" LZ #80 might not be such an achievement, considering the LZ network is probably very bad at playing as Black in high handicap games. It would be more interesting to see this against humans.

The author of the KGS Leela bot Petgo added the functionality to the bot. There seem to be a couple of high handicap games there already, although some opponents seem to have won by forfeit (+Forf), which might be related to comments from some people that the patched version of Leela sometimes hangs. But the wins will be interesting to check out:

https://www.gokgs.com/gameArchives.jsp?user=petgo3

Hint: KGS archives with color highlight of wins/losses of given user with my Tampermonkey (Chrome plugin) script: http://joonaspihlajamaa.com/data/kgs_graphs.user.js
