as0770 wrote:...The difference between ponder-on and ponder-off matches is nearly irrelevant compared to all the other influences, and it is certainly masked by statistical fluctuation when the engines are rated on fewer than 100 games.
When engines of the same strength spar, pondering is a significant parameter...
But the most significant difference (for playing strength) between these synthetic tests and real professional games is in the time control settings.
Re: Engine Tournament
Posted: Sun Dec 10, 2017 9:29 am
by as0770
q30 wrote:When engines of the same strength spar, pondering is a significant parameter...
Based on what tests? I have played thousands of computer AI games. Even if one engine is pondering and the other is not, you need hundreds of games to see the difference.
Re: Engine Tournament
Posted: Sat Dec 16, 2017 1:25 am
by q30
as0770 wrote:
Based on what tests? ...
On tests with parameters close to those of real games (2'' per move, for example).
Re: Engine Tournament
Posted: Sun Dec 17, 2017 11:56 pm
by as0770
q30 wrote:
as0770 wrote:
Based on what tests? ...
On tests with parameters close to those of real games (2'' per move, for example).
In chess, doubling the calculating time will make engines stronger by 60 Elo points. Engines of the same strength have about 50% ponder hits. So the difference between a pondering engine and a non-pondering engine is about 30 Elo. You need more than 1000 games to measure a difference of 30 Elo. And we are talking about ponder vs. no ponder. In Go the difference is even smaller because there are fewer ponder hits. Also, the gap between engines in their Elo gain from pondering is much less than 30 Elo. So after all, it is simply impossible to measure a difference in the Elo gain from pondering with such a small number of games.
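The arithmetic above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the ponder gain is taken as the post's rule of thumb (Elo per doubling times ponder-hit rate), games are treated as independent win/loss results with no draws, and the sample size comes from a normal approximation. The function names and the `z_sig` / `z_power` parameters are made up for this example.

```python
import math

def expected_score(elo_diff):
    """Expected score of the stronger side under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def ponder_gain(elo_per_doubling, ponder_hit_rate):
    """Rule of thumb from the post: gain = Elo per doubling * ponder-hit rate."""
    return elo_per_doubling * ponder_hit_rate

def games_needed(elo_diff, z_sig=1.96, z_power=0.0):
    """Approximate number of win/loss games needed to tell elo_diff from 0.

    z_sig   : z-value for the significance level (1.96 ~ 95% two-sided).
    z_power : extra z-value for statistical power (0.0 means the difference
              is detected only about half the time; 0.84 ~ 80% power).
    Assumption: independent games, no draws, normal approximation.
    """
    p = expected_score(elo_diff)
    sigma = math.sqrt(p * (1.0 - p))   # standard deviation of one game result
    delta = p - 0.5                    # score advantage to be detected
    return math.ceil(((z_sig + z_power) * sigma / delta) ** 2)

gain = ponder_gain(60, 0.5)            # 60 Elo per doubling, 50% ponder hits
print("ponder gain:", gain, "Elo")
print("games (threshold only):", games_needed(gain))
print("games (80% power):", games_needed(gain, z_power=0.84))
```

With these assumptions the bare significance threshold is reached after roughly five hundred games, and asking for the difference to be detected reliably pushes the figure to around a thousand, which is the order of magnitude quoted above. Smaller gains need far more games, since the count grows with the inverse square of the score advantage.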
Re: Engine Tournament
Posted: Sat Dec 23, 2017 12:51 am
by q30
as0770 wrote:
In chess, doubling the calculating time will make engines stronger by 60 Elo points...
The dependence of strength on time is not exactly linear, so the strength gain from doubling the time will depend on the absolute time value.
Pondering may have no effect only on simple MC engines such as MoGo, where, for example, increasing the thinking time with "--earlyCut 0" doesn't make the engine play stronger.
I'll test the effect of pondering on MoGo, Pachi, Ray and Leela soon.
Re: Engine Tournament
Posted: Tue Dec 26, 2017 8:34 am
by as0770
q30 wrote:
as0770 wrote:
In chess, doubling the calculating time will make engines stronger by 60 Elo points...
The dependence of strength on time is not exactly linear, so the strength gain from doubling the time will depend on the absolute time value.
30 years of statistics in computer chess say something different.
Re: Engine Tournament
Posted: Wed Dec 27, 2017 9:37 am
by Cyan
Leela Zero is much stronger now, can you test it again please?
Re: Engine Tournament
Posted: Wed Dec 27, 2017 10:06 am
by as0770
Cyan wrote:Leela Zero is much stronger now, can you test it again please?
I'd love to, but I am on vacation. I'll be back in 1-2 weeks.
Re: Engine Tournament
Posted: Wed Dec 27, 2017 1:03 pm
by lightvector
as0770 wrote:
q30 wrote:
as0770 wrote:
In chess, doubling the calculating time will make engines stronger by 60 Elo points...
The dependence of strength on time is not exactly linear, so the strength gain from doubling the time will depend on the absolute time value.
30 years of statistics in computer chess say something different.
Are you sure? I'm pretty sure I recall more than one case of computer chess statistics, one informally posted on a forum, and one from some published paper, indicating slightly sublinear elo gains with log(time). Possibly other attempts I didn't see got more linear results, perhaps it depends a little on the engine and perhaps you only see it if you test a wide enough range.
The order of magnitude differences were like a +35 elo difference for a given time multiplication factor becoming a +25 elo difference for that time multiplication factor between the ends of a range that was 5 or 6 orders of magnitude wide, or something like that (those numbers are all made up, I'm just trying to convey the rough scale of things that I fuzzily recall). So, not a big difference, but still a bit nonlinear. Unless I just made up those memories.
Re: Engine Tournament
Posted: Thu Dec 28, 2017 10:12 am
by as0770
lightvector wrote:Are you sure? I'm pretty sure I recall more than one case of computer chess statistics, one informally posted on a forum, and one from some published paper, indicating slightly sublinear elo gains with log(time). Possibly other attempts I didn't see got more linear results, perhaps it depends a little on the engine and perhaps you only see it if you test a wide enough range.
The order of magnitude differences were like a +35 elo difference for a given time multiplication factor becoming a +25 elo difference for that time multiplication factor between the ends of a range that was 5 or 6 orders of magnitude wide, or something like that (those numbers are all made up, I'm just trying to convey the rough scale of things that I fuzzily recall). So, not a big difference, but still a bit nonlinear. Unless I just made up those memories.
Indeed, this is true. I left out that with faster hardware or longer time controls there is a slight decrease in the Elo gain. But we are talking about a decrease from a 70 Elo gain on a 286 30 years ago to a 60, maybe 50 Elo gain nowadays. This has something to do with the increasing number of draws with nearly perfect play.
Re: Engine Tournament
Posted: Fri Dec 29, 2017 8:15 am
by q30
The results (with pondering - without pondering):
MoGo 3 - 1;
Pachi 3 - 1;
Ray 3 - 1;
Leela 3 - 1;
in all 12 - 4 (details).
I don't know about the quantitative result (in Elo), but there is definitely a qualitative effect: when Go engines of equivalent strength spar, an engine with pondering may overtake the same engine without pondering in the rating.
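As a rough way to put numbers on a 12 - 4 result, the sketch below computes a one-sided binomial tail probability and the Elo difference implied by the raw score. It assumes the 16 games can be pooled as independent trials with a 50% win chance under "no effect", which is a simplification because they come from four different engines. The function names are made up for this example.

```python
import math

def binomial_tail(wins, games, p=0.5):
    """Probability of at least `wins` wins in `games` games if each is won with probability p."""
    return sum(
        math.comb(games, k) * p ** k * (1.0 - p) ** (games - k)
        for k in range(wins, games + 1)
    )

def elo_from_score(score):
    """Elo difference implied by a score fraction under the standard logistic curve."""
    return 400.0 * math.log10(score / (1.0 - score))

wins, losses = 12, 4                   # pooled result: with pondering vs. without
games = wins + losses

p_value = binomial_tail(wins, games)   # chance of 12 or more wins from equal engines
implied = elo_from_score(wins / games)

print(f"one-sided p-value: {p_value:.4f}")
print(f"implied Elo difference: {implied:.0f}")
```

With only 16 games the uncertainty around the implied Elo value is very wide, so the figure says more about the direction of the effect than about its size.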
Re: Engine Tournament
Posted: Mon Jan 01, 2018 4:29 pm
by as0770
q30 wrote:... when Go engines of equivalent strength spar, an engine with pondering may overtake the same engine without pondering in the rating.
So we agree that the question is only relevant in matches between engines where one is able to ponder and the other is not? Fine. In your rating list Hiratuka is the only engine that does not ponder. You once had DarkGo, which moves instantly; Hira is limited to one minute. So for both, the question of the absolute time control and CPU power is much more relevant than the question of running them in ponder-on or ponder-off matches. And still our results are similar, if not equal. So what is your point in always calling others' results "synthetic"? Your testing does not become more valuable by disparaging others'. For me it does not make any sense to test engines made for GPU support without a GPU, so I have to play with ponder off to get realistic results.
Re: Engine Tournament
Posted: Wed Jan 03, 2018 6:48 am
by as0770
Finally, a Leela Zero update: v0.9 with the network file from 2018.1.1 makes it into League B and is now stronger than the human-trained version 0.1, which placed 6th in League B with 15 points against the same opponents.