Life In 19x19 http://www.lifein19x19.com/ |
LZ's progression http://www.lifein19x19.com/viewtopic.php?f=18&t=15718 |
Page 17 of 21 |
Author: | Vargo [ Sun Mar 17, 2019 7:05 am ] |
Post subject: | Re: LZ's progression |
55% winrate means A wins 55 games out of 100: A wins 55 when B wins 45. So, A wins 55/45 times more games than B. It's true, because (55/45)*45=55.
If B wins 55% over C, B wins 55/45 times more games than C.
A wins 55/45 times more games than B, who wins 55/45 times more games than C, so A wins (55/45)*(55/45) times more games than C, etc.
A10 wins (55/45)^10 times more games than A1, so A10 wins 7.44 times more games than A1.
When A1 wins 1 game, A10 wins 7.44 games, so A10 wins 7.44 out of 8.44, that's 88%.
For example, with an obvious case, if An wins 50% over An-1, after 10 networks it leads to 1^10=1, and 1 out of (1+1) is still 50%.
With 50.1%, it leads to (50.1/49.9)^10=1.04, and 1.04/2.04 =~ 51%, which looks reasonable.
With 45%, after 10 networks, we would get ~12%.
If we could have 10 consecutive networks with 60% winrate over the preceding one, we'd have 98.3% winrate for A10 over A1.
I hope it's understandable, I don't really speak English (as you've probably noticed) |
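The arithmetic in this post can be reproduced with a few lines of Python. This is only a sketch: the function name `chained_winrate` is made up for illustration, and it takes for granted the post's own assumption that odds multiply from one network to the next, which later replies dispute.

```python
def chained_winrate(step_winrate: float, steps: int) -> float:
    """Win rate of the last network over the first, ASSUMING odds multiply at each step."""
    # Odds of one promotion step, e.g. 55/45 for a 55% win rate.
    step_odds = step_winrate / (1.0 - step_winrate)
    # Multiply the odds across all steps, then convert odds back to a win rate.
    total_odds = step_odds ** steps
    return total_odds / (1.0 + total_odds)


# The per-step win rates discussed in the post, each chained over 10 steps.
for p in (0.50, 0.501, 0.55, 0.45, 0.60):
    print(f"{p:.1%} per step, 10 steps -> {chained_winrate(p, 10):.1%}")
```

The result is only as good as the multiplication assumption, so it is best read as an upper-level estimate to compare against a direct match between the first and last network.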
Author: | ez4u [ Sun Mar 17, 2019 7:19 am ] |
Post subject: | Re: LZ's progression |
Interesting discussion on GitHub of an experimental version of LZ that uses alternative logic to select the best play with the same nets. See https://github.com/leela-zero/leela-zero/issues/2282 for the details and links to code or compiled Windows downloads. The experimental version is showing a ~60% winning percentage in matches at various visit levels versus normal LZ. |
Author: | Tryss [ Sun Mar 17, 2019 9:18 am ] |
Post subject: | Re: LZ's progression |
Vargo wrote: I hope it's understandable, I don't really speak English (as you've probably noticed)
It's understandable, but there's no reason for this to be true. Just because you have a 1:a ratio against player A, and player A has a 1:b ratio against player B, doesn't mean you must have a 1:(a*b) ratio against B.
It's already not true for the following simple game: player A rolls two 8-sided dice, player B rolls two 6-sided dice, and player C rolls two 4-sided dice. The one with the biggest sum wins, and if the sums are equal, the one with the smaller dice wins.
In this simple game, A has a 63.93% chance to win against B (1473/2304, or a 1:1.773 ratio), B has a 69.10% chance to win against C (398/576, or a 1:2.236 ratio), and A has an 82.42% chance to win against C (844/1024, or a 1:4.689 ratio). But what you propose would give A a 79.85% chance to win against C (1:3.964). |
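The dice game is small enough to enumerate exhaustively. The sketch below counts every combination of rolls and compares the direct A-versus-C result with the one obtained by multiplying odds; the helper names `win_probability` and `odds` are invented for this example, and the faces are assumed to be numbered 1 to the number of sides.

```python
from fractions import Fraction
from itertools import product


def win_probability(sides_big: int, sides_small: int, dice: int = 2) -> Fraction:
    """Exact chance that the player with the larger dice rolls a strictly higher sum.

    Ties go to the player with the smaller dice, as in the post.
    Faces are ASSUMED to be numbered 1..sides.
    """
    wins = total = 0
    for big in product(range(1, sides_big + 1), repeat=dice):
        for small in product(range(1, sides_small + 1), repeat=dice):
            total += 1
            wins += sum(big) > sum(small)
    return Fraction(wins, total)


def odds(p: Fraction) -> Fraction:
    """Convert a win probability into win:loss odds."""
    return p / (1 - p)


a_b = win_probability(8, 6)  # A (two d8) against B (two d6)
b_c = win_probability(6, 4)  # B (two d6) against C (two d4)
a_c = win_probability(8, 4)  # A against C, measured directly

chained_odds = odds(a_b) * odds(b_c)
chained = chained_odds / (1 + chained_odds)

print(f"A beats B: {float(a_b):.2%}")
print(f"B beats C: {float(b_c):.2%}")
print(f"A beats C directly: {float(a_c):.2%}")
print(f"A beats C by multiplying odds: {float(chained):.2%}")
```

Using `Fraction` keeps the counts exact, so the comparison between the direct and the chained figure is not blurred by rounding.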
Author: | Vargo [ Sun Mar 17, 2019 9:45 am ] |
Post subject: | Re: LZ's progression |
ez4u wrote: Player A roll two 8 sided dices, ...
As I said, a little controversy... I'm not home now, but I'm looking forward to trying your dice |
Author: | Bill Spight [ Sun Mar 17, 2019 9:52 am ] |
Post subject: | Re: LZ's progression |
Vargo wrote: 55% winrate means A wins 55 games out of 100, A wins 55 when B wins 45. So, A wins 55/45 times more games than B . It's true, because (55/45)*45=55. If B wins 55% over C, B wins 55/45 times more games than C. A wins 55/45 times more games than B, who wins 55/45 times more games than C, so A wins (55/45)*(55/45) times more games than C. etc. {snip} I hope it's understandable, I don't really speak English (as you've probably noticed)
Yes, it is understandable and clear. However, there is an underlying assumption that the difference between the abilities of A, B, and C to win games is reducible to a single number. (There is also the assumption of perfect accuracy of the win rate estimates, i.e., no luck, which has already been alluded to.) But as we know, go requires a number of different skills, which means that skill at go may not be reducible to a single number. And that means that transitivity does not hold. Player A may beat player B more than half the time, player B may beat player C more than half the time, and player C may beat player A more than half the time.
Now, transitivity holds closely enough in go that we can have different ranks, each of which covers a range of ratings, and make pretty good predictions of the handicap between players of different ranks which will make the win rates around 50%. But, OC, for specific individual pairings the recommended handicap may not do that. One thing that makes the ranking system robust is that each player plays against a variety of different players with different levels of ability at different skills. Self play does not do that, and so, IMO, does not produce robust results.
To give a possibly related example of how multidimensionality can reduce the degree of progress, let's suppose that we are measuring progress in two independent dimensions. Suppose that B makes one unit of progress by comparison with A, and C makes the same unit of progress by comparison with B, but in the orthogonal direction to that of the progress between A and B. Then how much progress does C make with regard to A? Not 2 units, but √2 units. |
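The cycle described above (A usually beats B, B usually beats C, C usually beats A) can be shown with a toy example. The three dice below are hypothetical, chosen only for illustration and not taken from the thread; the sketch enumerates all face pairs to get exact probabilities.

```python
from fractions import Fraction
from itertools import product

# Hypothetical dice chosen for illustration; they are NOT from the thread.
DICE = {
    "A": (2, 2, 4, 4, 9, 9),
    "B": (1, 1, 6, 6, 8, 8),
    "C": (3, 3, 5, 5, 7, 7),
}


def beats(x: str, y: str) -> Fraction:
    """Exact probability that die x shows a higher face than die y."""
    rolls = list(product(DICE[x], DICE[y]))
    return Fraction(sum(a > b for a, b in rolls), len(rolls))


# Each pairing favours the first die, yet the three pairings form a cycle.
for x, y in (("A", "B"), ("B", "C"), ("C", "A")):
    print(f"{x} beats {y} with probability {beats(x, y)}")
```

No single strength number can rank these three dice, which is the situation a one-dimensional rating cannot represent.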
Author: | Tryss [ Sun Mar 17, 2019 10:01 am ] |
Post subject: | Re: LZ's progression |
Vargo wrote: ez4u wrote: Player A roll two 8 sided dices, ... As I said, a little controversy... I'm not home now, but I'm looking forward to trying your dices
8-sided dice are common in tabletop gaming: 4-, 6-, 8-, 10-, 12- and 20-sided dice are usual. But there exist more exotic dice. |
Author: | Vargo [ Sun Mar 17, 2019 11:08 am ] |
Post subject: | Re: LZ's progression |
@Tryss Your dice are beautiful. I had some such. But in your example, A-B play a certain game, B-C play another game, because the dice are different, and A-C a third different game. In this case, I'm not surprised that winrates aren't transitive.
Bill Spight wrote: Now, transitivity holds closely enough in go that we can have different ranks,...
Yes, it's true, fortunately!
Bill Spight wrote: ...Self play does not do that, and so, IMO, does not produce robust results.
It's true too, unfortunately. |
Author: | Tryss [ Sun Mar 17, 2019 11:16 am ] |
Post subject: | Re: LZ's progression |
No, it's the same game: roll your dice, and the one with the better score wins. Player A is just a stronger player than B or C. |
Author: | Bill Spight [ Sun Mar 17, 2019 11:26 am ] |
Post subject: | Re: LZ's progression |
Tryss wrote: Vargo wrote: I hope it's understandable, I don't really speak English (as you've probably noticed)
It's understandable, but there's no reason for this to be true. Just because you have a 1:a ratio against player A, and player A has a 1:b ratio against player B, doesn't mean you must have a 1:(a*b) ratio against B.
It's already not true for the following simple game: player A rolls two 8-sided dice, player B rolls two 6-sided dice, and player C rolls two 4-sided dice. The one with the biggest sum wins, and if the sums are equal, the one with the smaller dice wins.
In this simple game, A has a 63.93% chance to win against B (1473/2304, or a 1:1.773 ratio), B has a 69.10% chance to win against C (398/576, or a 1:2.236 ratio), and A has an 82.42% chance to win against C (844/1024, or a 1:4.689 ratio). But what you propose would give A a 79.85% chance to win against C (1:3.964).
I suppose that the faces of each die are numbered consecutively from 1 to the number of faces. Let's suppose that each player rolls only one die. Then A has a 9/16 chance (56.25%) to beat B, with odds of 9:7, and B has a 7/12 chance (58.33%) to beat C, with odds of 7:5. Multiplying the odds gives A odds of 9:5 to beat C, or 9/14 of the time (64.29%). But A beats C 11/16 of the time (68.75%), with odds of 11:5. |
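The one-die version is simple enough to count by hand, and the counting rule can be written down directly. In this sketch the name `one_die_win` is invented for illustration; it assumes faces numbered from 1 and that a tie goes to the smaller die, as in the two-dice game.

```python
from fractions import Fraction


def one_die_win(sides_big: int, sides_small: int) -> Fraction:
    """Chance that one roll of the bigger die strictly exceeds one roll of the smaller die."""
    # For a roll r of the bigger die, the smaller die loses on faces 1..r-1,
    # capped at its own number of faces.
    wins = sum(min(roll - 1, sides_small) for roll in range(1, sides_big + 1))
    return Fraction(wins, sides_big * sides_small)


a_b = one_die_win(8, 6)
b_c = one_die_win(6, 4)
a_c = one_die_win(8, 4)

# Multiply the odds A:B and B:C, then turn the product back into a probability.
product_odds = (a_b / (1 - a_b)) * (b_c / (1 - b_c))
predicted_a_c = product_odds / (1 + product_odds)

print(f"A beats B: {a_b}, B beats C: {b_c}")
print(f"A beats C directly: {a_c} = {float(a_c):.2%}")
print(f"A beats C by multiplying odds: {predicted_a_c} = {float(predicted_a_c):.2%}")
```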
Author: | moha [ Sun Mar 17, 2019 11:23 pm ] |
Post subject: | Re: LZ's progression |
Out of curiosity I tested my method: 55% winrate means ~0.178 sd distribution distance, and 1.78 sd gives 89% - no surprise here.
Transitivity is OC debatable but I doubt that would be the larger effect in this case. Just retesting those 55% promotions with more games may reduce most to lower winrates. This is, after all, how "55% for 400 games" was chosen: a statistical mass that makes it hard to pass on luck ALONE (in a few dozen tries), so new nets are at least slightly better - but nothing more. And those 400 samples are not even really independent: the first few moves and joseki choices are often identical, which further reduces the statistical weight. |
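Both numerical points in this post can be sketched in Python. The post does not spell out its method, so this is a guess: it assumes each player's performance is normally distributed with standard deviation 1, which does reproduce the quoted ~0.178 sd. The second function assumes the promotion gate is 220 wins out of 400 independent games; all function names are made up for this example.

```python
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def winrate_to_distance(p: float) -> float:
    """Gap between two unit-sd performance distributions that yields win rate p (ASSUMED model)."""
    # The difference of two independent unit-sd normals has sd sqrt(2).
    return sqrt(2.0) * STD_NORMAL.inv_cdf(p)


def distance_to_winrate(d: float) -> float:
    """Win rate implied by a gap of d standard deviations under the same ASSUMED model."""
    return STD_NORMAL.cdf(d / sqrt(2.0))


def chance_of_lucky_pass(games: int = 400, needed: int = 220) -> float:
    """Probability that a net of exactly equal strength still wins at least `needed` games."""
    favourable = sum(comb(games, k) for k in range(needed, games + 1))
    return favourable / 2 ** games


step = winrate_to_distance(0.55)
print(f"55% per step corresponds to {step:.3f} sd")
print(f"ten such steps give {distance_to_winrate(10 * step):.1%}")
print(f"equal-strength net passing 220/400 by luck: {chance_of_lucky_pass():.2%}")
```

The binomial figure treats the 400 games as independent, which the post itself points out is optimistic when openings repeat.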
Author: | Bill Spight [ Mon Mar 18, 2019 12:19 am ] |
Post subject: | Re: LZ's progression |
There is a simple scientific point here, as well. Suppose that B beats A more often than not, and C beats B more often than not, and D beats C more often than not, etc., and we want to know how much more often, say, K beats A, our preferable method is not to try to figure it out based upon our estimates of how often B beats A and C beats B, etc., but to have A and K play against each other. Unless it is prohibitively costly or there are other reasons for not doing so. One possible reason for not doing so is that both J and K beat A almost 100% of the time, so the answer is uninteresting. But maybe how often K beats D would be interesting. We really should not be arguing about the pluses and minuses of an inferior method. |
Author: | Vargo [ Tue Mar 19, 2019 4:10 am ] |
Post subject: | Re: LZ's progression |
40 game match LZ0.16_#213 v. LZ0.16_ELFv2 at time parity (--visits=1601 for #213, --visits=3201 for ELFv2)
twogtp 1.5.0, 3 duplicate games, 37 games used.
Result: ELFv2 wins 19-18
The stats: Attachment:
Next time, I'll use -m 20 to avoid duplicates. |
Author: | And [ Tue Mar 19, 2019 6:16 am ] |
Post subject: | Re: LZ's progression |
Several matches on 25x25: nets obtained with board_resize.py.txt vs LZ 40x256 #205 obtained with ChangeBoardSizeOfWeight.cpp, 10sec/move, cpuonly, gogui-twogtp (https://github.com/leela-zero/leela-zero/issues/2240):
LM 192x15 GX89 - LZ 40x256 #205 13:27
LZ 192x15 f438268e - LZ 40x256 #205 5:35
elf v2 256x20 - LZ 40x256 #205 12:28
converted minigo(25x25) 000990-cormorant works, did not test
and
LM 192x15 GX89(by ChangeBoardSizeOfWeight.cpp) - LM 192x15 GX89(by board_resize.py.txt) 37:3 (White 20:0) |
Author: | nbc44 [ Fri Mar 22, 2019 7:20 pm ] | ||
Post subject: | Re: LZ's progression | ||
Time parity match. LZ0.16 XXX and LZ0.16 Elfv2
2x1080ti, 60s per move.
5). #211
Code:
#211 v elfv2 (27 games)
         wins          black         white
#211      5 18.52%      2 16.67%      3 20.00%
elfv2    22 81.48%     10 83.33%     12 80.00%
                       12 44.44%     15 55.56%
6). #213
Code:
#213 v elfv2 (26 games)
         wins          black         white
#213     12 46.15%      4 44.44%      8 47.06%
elfv2    14 53.85%      5 55.56%      9 52.94%
                        9 34.62%     17 65.38%
7). #214 in progress... |
Author: | Vargo [ Fri Mar 22, 2019 11:15 pm ] |
Post subject: | Re: LZ's progression |
50 game match at time parity: #214 v. ELFv2
LZ0.16, twogtp 1.5.0, -v 1601 for #214 and -v 3201 for Elf, -m 20 for both.
No duplicate games, no errors.
ELFv2 wins 28-22 (56%)
The games (#214 is B in the even numbered games): Attachment:
Command line and stats: |
Author: | moha [ Sat Mar 23, 2019 1:51 am ] |
Post subject: | Re: LZ's progression |
Vargo wrote: -m 20 for both
This is for selfplay I think, it may be too random for matches. If you just want to avoid duplicates you could look into --randomtemp (and/or check if there are no weird edge moves). |
Author: | Vargo [ Sat Mar 23, 2019 5:09 am ] |
Post subject: | Re: LZ's progression |
moha wrote: it may be too random for matches
You're right, maybe it's too random. I've looked at the first 20 games, and there is no obviously weird move that I can see. In one game, Elf is caught in a ladder before resigning. Anyway, I'll try -m 20 --randomtemp=0.xxx |
Author: | Vargo [ Sat Mar 23, 2019 8:51 am ] |
Post subject: | Re: LZ's progression |
I've tried another 50 game match #214 v. ELF v2
Same parameters, but for -m 20 --randomtemp=0.3
Average game length and average times are almost the same as before, no duplicate.
The games look "normal", but in one case (THIS GAME, n°40), it's #214 (B) which gets caught in a ladder, and the last W moves look weird, but maybe it's because the winrate was near 100% for W.
Command line and stats: Attachment: |
Author: | Bill Spight [ Sat Mar 23, 2019 9:15 am ] |
Post subject: | Re: LZ's progression |
Vargo wrote: The games look "normal", but in one case (THIS GAME, n°40) , it's #214 (B) which gets caught in a ladder, and the last W moves look weird, but maybe it's because the winrate was near 100% for W. Maybe it has a preference for moves on the first line when the game is nearly over. |
Author: | nbc44 [ Sun Mar 24, 2019 3:20 pm ] | ||
Post subject: | Re: LZ's progression | ||
Vargo wrote: -v 1601 for #214 and -v 3201 for Elf ELFv2 wins 28-22 (56%)
Full disaster:
Code:
The first net is worse than the second
#214 v elfv2 (77 games)
         wins          black         white
#214     26 33.77%     12 33.33%     14 34.15%
elfv2    51 66.23%     24 66.67%     27 65.85%
                       36 46.75%     41 53.25%
I think "-v 1601" is too small for l0. |
All times are UTC - 8 hours [ DST ] |