Bot strength

For discussing go computing, software announcements, etc.
wineandgolover
Lives in sente
Posts: 866
Joined: Sun Jul 25, 2010 6:05 am
GD Posts: 0
Has thanked: 318 times
Been thanked: 345 times

Bot strength

Post by wineandgolover »

Has anyone found a convincing way to put AlphaGo Lee, AlphaGo Master, and AlphaGo Zero in a hierarchy with KataGo, Golaxy, and FineArt?

I remember reading last year that Golaxy's developers were very confident that they and FineArt had surpassed AlphaGo, though I’m not sure how they’d know. I guess they could review AlphaGo's self-play games and look for positive surprises and mistakes.

How confident are we that KataGo has or has not surpassed AlphaGo?
- Brady
Want to see videos of low-dan mistakes and what to learn from them? Brady's Blunders
gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: Bot strength

Post by gennan »

KataGo may have surpassed AlphaGo under equal conditions (millions of playouts per move). But the vast majority of KataGo users don't have the hardware to support such a high number of playouts.
If we say that KataGo is stronger than AlphaGo, many may assume that KataGo on their mediocre laptop with only 1000 playouts per move is stronger than AlphaGo with millions of playouts per move, and this may not be true.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: Bot strength

Post by Uberdude »

Even with only tens of thousands of playouts, I think LeelaZero and KataGo are stronger than AlphaGo Lee was during its match. I say this without solid proof, but my evidence is reviewing those games: where the bots differ from AG Lee, the sequences they give seem convincing. Also, stronger versions of AlphaGo identified AG Lee making mistakes, e.g. the joseki shock peep in game 2 was an overplay and bad if Lee resisted, which both the AG teaching tool and LZ agree on. There are also similarities in how the bots' preferences evolved: LZ used to like the hanging connection in the high approach to the 3-4 point but doesn't anymore (because it's not sente; the solid connection is), and AG Zero has the same preference. So the fact that AG Lee plays it is further evidence that it's weaker and not as far along the evolution path.
RobertJasiek
Judan
Posts: 6273
Joined: Tue Apr 27, 2010 8:54 pm
GD Posts: 0
Been thanked: 797 times
Contact:

Re: Bot strength

Post by RobertJasiek »

gennan wrote: KataGo may have surpassed AlphaGo under equal conditions (millions of playouts per move).
Roughly what hardware and how much thinking time per move would allow millions of playouts per move, IYO?
And
Gosei
Posts: 1464
Joined: Tue Sep 25, 2018 10:28 am
GD Posts: 0
Has thanked: 212 times
Been thanked: 215 times

Re: Bot strength

Post by And »

gennan wrote:KataGo may have surpassed AlphaGo under equal conditions (millions of playouts per move). But a vast majority of KataGo users don't have the hardware to support such high number of playouts.
If we say that KataGo is stronger than AlphaGo, many may assume that KataGo on their mediocre laptop with only 1000 playouts per move is stronger than AlphaGo with millions of playouts per move and this may not be true.
Also, the "vast majority of users don't have" the opportunity to test AlphaGo either! What are you comparing with what? :)
jann
Lives in gote
Posts: 445
Joined: Tue May 14, 2019 8:00 pm
GD Posts: 0
Been thanked: 37 times

Re: Bot strength

Post by jann »

According to DeepMind the strongest version of AlphaGo was AlphaGo Zero 40b.

It is very likely that even KataGo has surpassed its strength by now (on hardware parity), since AGZ worked without liberty and ladder inputs, which should definitely amount to a noticeable bonus (an effective net size increase) when present. Not being score-blind should also give a strength increase (training with auxiliary inputs and outputs gives stronger results even when the auxiliary part is not used later).

FineArt and Golaxy are likely even further ahead at the moment. Of course, the practical question is hardware.
gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: Bot strength

Post by gennan »

RobertJasiek wrote:
gennan wrote:KataGo may have surpassed AlphaGo under equal conditions (millions of playouts per move).
Roughly what hardware and how much thinking time per move do allow millions of playouts per move, IYO?
I'm no expert. I only know some anecdotes:

In August 2020 @goame reported he got roughly 100k playouts per minute with a KataGo 40-block, 384-channel network running on 2x RTX 2080 Ti and 64 GB RAM.

In 2017 DeepMind made their AlphaGo teaching tool (an opening database) and it seems they got roughly 1M playouts per minute with AlphaGo Master running on their hardware. I don't know what that was, perhaps 4 TPUs? It must have been pretty powerful.
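
As a rough back-of-the-envelope check on these anecdotes, the sketch below converts a playout rate into thinking time per move. The two rates are the ones quoted above; the function name and the 1,000,000-playout target are illustrative assumptions, and real throughput varies with network size, batch size, and position.

```python
def minutes_per_move(target_playouts: int, playouts_per_minute: float) -> float:
    """Thinking time in minutes needed to reach target_playouts at a fixed rate."""
    if playouts_per_minute <= 0:
        raise ValueError("playouts_per_minute must be positive")
    return target_playouts / playouts_per_minute


# Assumption: take "millions of playouts per move" at its lower end.
TARGET_PLAYOUTS = 1_000_000

# Anecdotal rates from the post above (playouts per minute).
KATAGO_2X_2080TI = 100_000       # KataGo 40-block net on 2x RTX 2080 Ti
ALPHAGO_TEACHING_TOOL = 1_000_000  # AlphaGo Master on DeepMind's hardware

katago_minutes = minutes_per_move(TARGET_PLAYOUTS, KATAGO_2X_2080TI)
master_minutes = minutes_per_move(TARGET_PLAYOUTS, ALPHAGO_TEACHING_TOOL)

print(f"KataGo on 2x 2080 Ti: {katago_minutes:.1f} min per move")
print(f"AlphaGo teaching tool: {master_minutes:.1f} min per move")
```

Under these numbers, the dual-GPU setup would need about ten minutes per move to reach a million playouts, versus about one minute at the teaching tool's rate.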
wineandgolover
Lives in sente
Posts: 866
Joined: Sun Jul 25, 2010 6:05 am
GD Posts: 0
Has thanked: 318 times
Been thanked: 345 times

Re: Bot strength

Post by wineandgolover »

[Attached image: 7594EA93-8927-4F49-9FEE-7ECE2D6BB862.jpeg]
I see that somebody on reddit tried to answer this (not rigorously) a couple of months ago.

https://www.reddit.com/r/baduk/comments ... g_for_ais/
- Brady
Want to see videos of low-dan mistakes and what to learn from them? Brady's Blunders
gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: Bot strength

Post by gennan »

I saw that post too, but it looks like the absolute Elo ratings used there have no relation to other go rating systems. Only the relative Elo ratings may have some meaning, but the meaning is not much more than a simple ranking IMO (ordering the list by strength).
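
To illustrate why only the differences carry information, here is a minimal sketch of the standard Elo expected-score formula. The ratings and the shift are made-up numbers, not values from the reddit table, and whether that table's numbers were actually fitted to this model is not something the post establishes.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (base 10, scale 400)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings for two bots, 200 points apart.
bot_a, bot_b = 3000.0, 2800.0

# Shifting both ratings by the same constant leaves the prediction unchanged,
# so the absolute numbers are arbitrary; only the gap matters.
shift = 1234.0
original = expected_score(bot_a, bot_b)
shifted = expected_score(bot_a + shift, bot_b + shift)

print(f"expected score for A: {original:.3f}")
print(f"after shifting both ratings: {shifted:.3f}")
```

Even the gap is only meaningful if the ratings come from actual games between the listed bots under comparable conditions; otherwise the list is just an ordering.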
Mike Novack
Lives in sente
Posts: 1045
Joined: Mon Aug 09, 2010 9:36 am
GD Posts: 0
Been thanked: 182 times

Re: Bot strength

Post by Mike Novack »

That post fails to report not only the number of visits but also the real time (time control).

It is not just equality of visits that matters (if that measure is used), because the number chosen might fall before a "knee" for one bot but not for another.

And of course real time is the true measure, since it changes the number of visits, and not necessarily equally for each bot. I consider "equal real time" to be the correct measure, since go is played with time controls. If we want to compare to human players, the comparison must be at the speeds used for human go. If we are asking whether a program is up to the strength of a top 9p, the time control should be what might be used for a top pro title challenge game. Say, a minute per move.
wineandgolover
Lives in sente
Posts: 866
Joined: Sun Jul 25, 2010 6:05 am
GD Posts: 0
Has thanked: 318 times
Been thanked: 345 times

Re: Bot strength

Post by wineandgolover »

Regarding the table above, the poster did make clear it was just for fun. He did the best he could, given the lack of direct comparison. I was shocked when lightvector published his final Elo rating comparisons to prior versions and to LZ and Elf. The table incorporates these real-world comparisons. Where do you think AG fits in?

I get the equal time argument, but it would be easier to defend equal time and equivalent hardware. After all, AG Lee used tons of TPUs (playouts) to overcome its relative weakness. Even after AG Lee, DeepMind used 4 TPUs, which is super-fast. FineArt supposedly uses hundreds of GPUs.

IMHO, to understand the true strength of a bot, you shouldn’t handicap it with time. Let FineArt use all its GPUs, but let fat KataGo do so too. Or give KataGo all the time it wants.

Finally and separately, the closed bots have a big advantage: they have access to the best open bots. I remember rumors that within weeks of the recent g170 KataGo release, at least one closed bot started playing a sequence it hadn’t played before, one that KataGo favored. It makes sense that they would be strongest. Please note that I haven’t confirmed these rumors. I’d love to see some evidence!
- Brady
Want to see videos of low-dan mistakes and what to learn from them? Brady's Blunders