as0770 wrote: If you use little time/playouts, you can determine the strength with little time/playouts. If you want to know the strength with much time/playouts, you have to play with much time/playouts. In both cases you need the same number of games to get a statistically significant result. Quite simple, isn't it?
OK. And in this case, why did you start this discussion after I wrote that your tests are "synthetic" because of the unrealistically low number of playouts in them? (To anticipate: the answer to this question isn't quite simple for me...)
as0770 wrote: Some think that you can measure the strength with a few games as long as the quality is good enough.
I haven't seen such claims.
You can start reading at #227.
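For readers who want to check the point about game counts themselves, here is a minimal sketch. It assumes independent games, no draws, and a normal approximation to the binomial; the function names are made up for illustration and do not come from any engine or testing tool. Under these assumptions the required number of games depends only on the win rate, and the time or playouts per move does not appear anywhere in the formula.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% interval (for z=1.96) around the observed win rate."""
    p = wins / games
    half_width = z * math.sqrt(p * (1.0 - p) / games)
    return p - half_width, p + half_width

def games_needed(true_win_rate, z=1.96):
    """Smallest number of games for which the interval around the given
    win rate no longer contains 0.5 (assumes true_win_rate != 0.5)."""
    p = true_win_rate
    return math.floor(z * z * p * (1.0 - p) / (p - 0.5) ** 2) + 1

if __name__ == "__main__":
    # Win rate, implied Elo difference, games needed for significance.
    for p in (0.55, 0.60, 0.70):
        print(p, round(elo_diff(p)), games_needed(p))
```

The closer two engines are in strength, the more games are needed, whatever the time control. What the time control changes is which strength is being measured, not how many games the measurement takes.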
What do you mean by "in this case"? I wrote nothing new and nothing contradictory. Seriously, I am not sure if you just don't understand anything or if you are trying to fool us.
I won't start this argument once again. I played 1h and 2h games on 1-4 cores. This is not such a low number of playouts, especially since you later quote games with 3000 visits. Your comments were just disrespectful and unfounded.
In case you understand what you wrote... On your contradiction, once again (the one you say you never wrote)... The game duration must depend on the number of moves: it should not be the same for a 91-move game and a 291-move game. I never used a visit limit in my tests. Maybe I quoted another test in some context... I limit only the time per move (except for one additional match with a limited number of playouts). I never go by the visit count shown in the shell; since playouts are printed there continuously, it's more convenient to go by the number of playouts. You can see the playout counts in my tests for LeelaZero with different network weight categories here.
Please, let's calm this down. I think it is clear to some of us that at least part of the problem is with language. I'm pretty sure not all of us are native English speakers.
q30 wrote: On your contradiction, once again (the one you say you never wrote)... The game duration must depend on the number of moves: it should not be the same for a 91-move game and a 291-move game. I never used a visit limit in my tests. Maybe I quoted another test in some context... I limit only the time per move (except for one additional match with a limited number of playouts). I never go by the visit count shown in the shell; since playouts are printed there continuously, it's more convenient to go by the number of playouts. You can see the playout counts in my tests for LeelaZero with different network weight categories here.
Whether one prefers a time setting of "game in 120 minutes" or "1s/move" is a matter of taste. A "game in X" setting means that an engine usually uses most of its time on the first 200 moves and plays faster once the game is already decided. There are good reasons for both options. There is no need to insult someone for choosing either one.
Still I have no clue what contradiction you are talking about...
About time: 1) "An engine usually uses" doesn't mean the engine tests are equivalent... 2) At move 200 the game may not be decided yet... So one option may be good for play against humans, and the other for engine tests.
The contradiction is that a 1h game on 1 core and a 2h game on 4 cores can't both be tests for determining a single engine rating, because:
If you use little time/playouts, you can determine the strength with little time/playouts. If you want to know the strength with much time/playouts you have to play with much time/playouts.
I use a translator only to translate some words into English while I'm writing a post. So, if you want most of the people here (who have already written that they aren't native English speakers) to understand you correctly, please try to use simple, unambiguous terminology without any beautiful but superfluous words and phrases.
q30 wrote: I use a translator only to translate some words into English while I'm writing a post. So, if you want most of the people here (who have already written that they aren't native English speakers) to understand you correctly, please try to use simple, unambiguous terminology without any beautiful but superfluous words and phrases.
Since language appears to be a problem, perhaps it would help to restate what you think the other person said before offering a reply. This can help even when everybody speaks the same language.
The Adkins Principle: At some point, doesn't thinking have to go on?
— Winona Adkins
OK, but only if restating in one's native language is permitted...
Last edited by q30 on Sat Jan 11, 2020 2:17 am, edited 1 time in total.
The best "middleweight" neural net of 2019 is 20b_254_784k_q, but it's obviously weaker than the best "welterweight" nets. Moreover, the 2019 winner of the "welterweight" class, 15b_249a_296k_q, beat the best "heavyweight" net (details)...