“Decision: case of using computer assistance in League A”

Javaness2 · Post by **Javaness2** » Fri Jun 15, 2018 6:47 am

theoldway wrote:
Here is another flaw. What if we are analyzing a move that Leela suggests as good, but only after let's say 50 nodes (= 40 / 60secs without a dedicated GPU = 15secs with a standard GPU), but the accused player took only 5/10 seconds to play it?
In all analysis we've discussed so far has anyone considered every move for the same time that the accused players actually used?

This is a valid point. I don't know how many games from IGS were saved with move timestamps. On KGS I think these are there by default, but IGS...

Uberdude · Post by **Uberdude** » Fri Jun 15, 2018 6:53 am

theoldway wrote: Here is another flaw. What if we are analyzing a move that Leela suggests as good, but only after let's say 50 nodes (= 40 / 60secs without a dedicated GPU = 15secs with a standard GPU), but the accused player took only 5/10 seconds to play it?
In all analysis we've discussed so far has anyone considered every move for the same time that the accused players actually used?

Without also specifying the computer's power, time spent on a move cannot be matched to a number of nodes (the two are proportional, but with an unknown constant), hence:

Uberdude wrote:When I did the Leela top 3 analysis a while ago I used 50k nodes per move (which takes about 45 seconds on my 5 year-old-and-no-GPU laptop) because that's what the initial investigation into Carlo's game used. I wonder where it came from, plucked out of the air/arse? It would have been a good idea to find out the specs of Carlo's computer (let's assume he didn't use a remote one) and with reference to move times in the game so the node count was plausible for cheating in that game.

If that information is not available (I think it would be reasonable for the referees to have required that he provide it), then you could guess some typical laptop/desktop spec. As Carlo is doing a PhD on deep learning, the assumption that he doesn't have access to a remote monster computer may be false. On the other hand, he did play fairly fast. Ignoring how strong the bot would be at high speed, I think it would be physically difficult to use a bot at a sustained pace of e.g. 15 seconds a move. Of course, using a bot to find a sequence and then playing it out quickly if it goes as expected from the previous analysis would be possible. But iirc Aja Huang did play as Master/Magister at 20 (or was it 30? I know Nie Weiping was 60) seconds a move, with AlphaGo, I heard, using only about 7 seconds a move (thanks TPUs!) to give him time to move the mouse, input moves etc.

Bojanic · Post by **Bojanic** » Fri Jun 15, 2018 7:32 am

If I am not mistaken, PGETC SGF files contain a time stamp?

This could be the number in brackets at the end of each move, indicating remaining time:
;B[pg]BL[1864]
;W[nj]WL[938]
;B[pk]BL[1845]
;W[pj]WL[929]
;B[qk]BL[1836]
;W[rk]WL[926]
From Metta-Ben David game
http://pandanet-igs.com/system/sgfs/637 ... 1511906173

Move 139 is at Q9, which, if I guessed correctly, is B[pk], on which 21 seconds were spent.
It would be nice to have this checked.
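One way to check this is to read the BL/WL values mechanically. The sketch below is only an illustration: it assumes BL and WL hold the seconds left on a player's clock after the move, that no overtime period resets the clock, and that each move node carries its time property directly after the coordinate, as in the excerpt above. The function name `time_spent` is made up for this example. On the six moves quoted, the subtraction gives 19 seconds for B[pk], so the 21-second figure may come from a different reading of the record.

```python
import re

# Excerpt quoted in the post above (Metta-Ben David game record).
SGF_EXCERPT = """
;B[pg]BL[1864]
;W[nj]WL[938]
;B[pk]BL[1845]
;W[pj]WL[929]
;B[qk]BL[1836]
;W[rk]WL[926]
"""

# One move node: colour, coordinate, then the matching time-left property
# (BL for Black, WL for White). \1 repeats the colour letter.
NODE = re.compile(r";([BW])\[([a-s]{2})\]\1L\[(\d+(?:\.\d+)?)\]")


def time_spent(sgf_text):
    """Return (colour, coord, seconds) for each move whose previous
    clock reading for the same colour is known.

    Assumption: time left only decreases (main time, no byo-yomi reset).
    """
    last_left = {}  # colour -> seconds left after that colour's previous move
    spent = []
    for colour, coord, left in NODE.findall(sgf_text):
        left = float(left)
        if colour in last_left:
            spent.append((colour, coord, last_left[colour] - left))
        last_left[colour] = left
    return spent


for colour, coord, seconds in time_spent(SGF_EXCERPT):
    print(colour, coord, seconds)
```

The first move of each colour in an excerpt is skipped because its previous clock reading is not available; on a full game record the starting main time could be used instead.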

Charlie · Post by **Charlie** » Fri Jun 15, 2018 9:44 am

Tryss wrote:
Gobang wrote:Finally, I find it fascinating that anyone embroiled in this kind of controversy would find it appropriate to be chief referee for a major go tournament, regardless of guilt or innocence.
But I can agree to this, he should step down from this position. And as you say, that's regardless of guilt or innocence.

[Image: Hikaru01-17.jpg, captioned "Should he drown himself, like Sai of the Fujiwara?"]

Bojanic · Post by **Bojanic** » Fri Jun 15, 2018 9:49 am

maf wrote:Bojanic, could you update your PDF such that it reflects the discussion since it was first published? I found it very hard to understand what work exactly you did and what not, and which data exactly you used (also for comparison with other players). That is very important. It would also be good to base it on a null hypothesis (i.e. no cheating) and work from that to your hypothesis that cheating in fact did occur.

Hello Maf,

here is the analysis.
It consists of the same data as the first analysis and the documents that followed.

[Attachment: Metta analysis Upd.pdf (926.68 KiB)]

I tried to resolve some uncertainties regarding methodology that were spotted and discussed here.
The chart of move differences is now included; in the previous version it was in the xls document that followed.
I made some small corrections to the text to make it clearer.
I have mentioned the other games and the preliminary research, which is important for showing how I arrived at the two analyzed games.

maf wrote:It would also be good to base it on a null hypothesis (i.e. no cheating) and work from that to your hypothesis that cheating in fact did occur.

You can easily see that this is the case in both versions of the analysis.

Charlie · Post by **Charlie** » Fri Jun 15, 2018 9:49 am

Bojanic wrote: This could be the number in brackets at the end of each move, indicating remaining time:

You are, indeed, correct:
- https://www.red-bean.com/sgf/properties.html#BL
- https://www.red-bean.com/sgf/properties.html#WL

Tryss · Post by **Tryss** » Fri Jun 15, 2018 10:00 am

@Charlie : There's a difference between a "public duty" as a referee and a normal player.

If you're accused of cheating and the situation raises a big controversy, you're not in a position to be a referee, because your decisions are then weakened in the eyes of the whole go community. In other words, the situation you're in will negatively affect your work as a referee.

Charlie · Post by **Charlie** » Fri Jun 15, 2018 10:37 am

Tryss wrote:There's a difference between a "public duty" as a referee and a normal player.

I completely agree. I was being facetious.

As soon as the original accusation was upheld by the referees, the damage was already done and, as I wrote very early on in this rather long thread, there ceased to be a "right" way for either Carlo or the referees or other players to act. Avoiding such situations in the future should now be our only goal.

Unfortunately, we need to extract a screw and have only a hammer. Until someone invents a screwdriver, we have to accept that removing the screw is impossible.

EDIT: Chess players have screwdrivers but they've had the advantage of decades of research. As it is, the pace of development in the Go world is smashing any records that they have set in the past.

Gobang · Post by **Gobang** » Fri Jun 15, 2018 11:19 am

Charlie wrote: Avoiding such situations in the future should now be our only goal.

Precisely. Until further notice there should be no cash prizes and no ranking awarded in online tournaments. The results of games played online should receive the merit that they deserve. No merit.

AlesCieply · Post by **AlesCieply** » Fri Jun 15, 2018 12:24 pm

I greatly appreciate the analysis Milos Bojanic performed, as it provides more insight on some points. In particular, looking at the selected important moves is a good idea. However, I am not sure the tenuki moves are well defined. The stronger the players are, the better they are at deciding when to tenuki and when not. Some moves may look forced to us, but a much stronger player could tenuki them with ease. Are ladder breaker moves tenuki? What about a move in an opposite corner that prevents the setting up of a ladder that is hard to spot in just one specific variation of a sequence played elsewhere? I guess we have to wait for stronger bots to become available to the public that would help us establish which moves to pick as the important ones when judging how strong a player is or whether he/she cheated by using AI.

On the computer Carlo Metta might have used in his PGETC games: in the Italian appeal they specify the computer on which they performed their counter-analysis: Intel Core i7, 2.60 GHz, 16 GB RAM, GPU NVIDIA GeForce GTX 960M, operating system Windows 10. They also say there that it analyses about 100k nodes in about 30s. I asked what computer Carlo used in his PGETC games, and the answer I got (from Mirco Fanti, the Italian team captain, as he insisted that questions should not be put to Carlo directly but should go through him) was that it was the one used in the analysis. I conclude from this that most (if not all) of the Italian counter-analysis was done by Carlo himself. This is fine, though it raises the question of why the Shakhov-Metta game, provided "presumably by mistake", was not excluded from the analysis, and why the mistake was not discovered at that point.
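To relate these node counts to the move times discussed earlier, here is a rough sketch. It assumes the node rate is constant and scales linearly with thinking time, which is a simplification. The two throughput figures are the ones quoted in this thread (about 100k nodes in 30 s from the Italian appeal, and 50k nodes in about 45 s on Uberdude's laptop). The function names and the `overhead` parameter (seconds lost to entering the opponent's move and reading the output) are hypothetical.

```python
def nodes_per_second(nodes, seconds):
    """Throughput implied by a (nodes, seconds) benchmark figure."""
    return nodes / seconds


def nodes_reachable(rate, seconds_available, overhead=0.0):
    """Nodes the engine could search within the time spent on a move.

    Assumption: constant rate; `overhead` is operator time during which
    the engine is not yet searching the current position.
    """
    usable = max(0.0, seconds_available - overhead)
    return rate * usable


appeal_rate = nodes_per_second(100_000, 30)  # figure quoted from the Italian appeal
laptop_rate = nodes_per_second(50_000, 45)   # Uberdude's no-GPU laptop

for label, rate in (("appeal machine", appeal_rate), ("no-GPU laptop", laptop_rate)):
    for move_time in (5, 10, 15):
        print(label, move_time, round(nodes_reachable(rate, move_time)))
```

Under these assumptions a 15-second move on the appeal machine corresponds to roughly 50k nodes, but only if the whole 15 seconds went into the search. Any realistic operator overhead lowers the plausible node count for fast moves.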

The discussion here is mostly about analyses and how difficult it is to do them properly and to get reliable results from them. For me, the most relevant evidence remains the Shakhov-Metta game. The Italian explanation raises only more questions (besides the one mentioned above):
- if someone wants to show his game at a go club, would he use his laptop or present the game on a real go board? I find the latter more likely, though I do not rule out the former
- if someone shows his game to an audience or friends on his laptop, I guess there would be questions asked (why not this move, shouldn't you or your opponent play another sequence) and there would be variations in the game record; there are none in the one provided by Carlo
- and finally and most importantly (and already mentioned in this lengthy topic), why would anyone show the game with the board turned by 180 degrees (from white's point of view, when the player saw it from black's point of view while playing the game on KGS)?

One more point from the analysis I presented that I do not remember being discussed here. I can very well understand that Carlo had a bad tournament at WAGC and underperformed there for whatever reasons. What I do not understand (if he did not cheat on the internet) is that in his regular games (including WAGC) he finds moves better than Leela's top choice almost 3 times as often as in his PGETC games, where he presumably used AI and so was not trying hard to find the best moves. Sure, the number of such moves is quite small, so the effect may disappear when more games are considered, but I still find this interesting.

Bill Spight · Post by **Bill Spight** » Fri Jun 15, 2018 12:33 pm

Tryss wrote:There's a difference between a "public duty" as a referee and a normal player.

Charlie wrote: As soon as the original accusation was upheld by the referees, the damage was already done and, as I wrote very early on in this rather long thread, there ceased to be a "right" way for either Carlo or the referees or other players to act. Avoiding such situations in the future should now be our only goal.

I don't know about the only, but certainly it should be our goal.

Chess players have . . . had the advantage of decades of research {in detecting cheating.}

We can learn from them.

But much remains to be done. A big step will be to get away from the fixation on matching plays with choices of bots, which is weak evidence. Chess players are able to rate the difficulty of individual plays, which enables them to eliminate easy plays from consideration. Currently we have to rely upon humans to do that, despite differences in judgement between players. Bojanic does a service by identifying forcing plays and tenuki. One foot in front of the other.

Bill Spight · Post by **Bill Spight** » Fri Jun 15, 2018 1:08 pm

AlesCieply wrote: I greatly appreciate the analysis Milos Bojanic performed, as it provides more insight on some points. In particular, looking at the selected important moves is a good idea. However, I am not sure the tenuki moves are well defined. The stronger the players are, the better they are at deciding when to tenuki and when not. Some moves may look forced to us, but a much stronger player could tenuki them with ease. Are ladder breaker moves tenuki? What about a move in an opposite corner that prevents the setting up of a ladder that is hard to spot in just one specific variation of a sequence played elsewhere? I guess we have to wait for stronger bots to become available to the public that would help us establish which moves to pick as the important ones when judging how strong a player is or whether he/she cheated by using AI.

It is not desirable to wait for stronger bots or for bots which have been developed to evaluate the difficulty of plays in order to have them sort important plays from unimportant plays. A practical approach is to have a panel of strong players to do that, if need be. You can let them work independently, and then identify plays upon which they agree, or let them consult with each other and come up with some choices.

In addition, much research needs to be done in detecting cheating at go. We do not necessarily need to follow the exact path taken in chess. lightvector's work developing neural nets to model players of different strengths (see viewtopic.php?t=15757 ) might be a place to start with rating the difficulty of plays, for instance.

Another thing that might be done is to create a database of games where one player is known to have cheated. (Because he or she did so by request, in order to add a game to the database.) You probably would want to have a database in which the same players did not cheat, for comparison purposes.

maf · Post by **maf** » Fri Jun 15, 2018 1:20 pm

Bojanic wrote:here is analysis.

Thank you. I'm sorry that I repeat myself, but I'm still looking for many games by other players checked in the same way, to see if they exhibited that same pattern as the games by CM. Earlier, I gave some naive ideas why that's a possibility. The next step would be to see just how outstanding this truly is. Unless we're sure he's an exception, the whole approach is, as far as I can tell, just not reliable. If that's not currently possible for you, that is fine, it should just be noted.

Jan.van.Rongen · Post by **Jan.van.Rongen** » Fri Jun 15, 2018 1:56 pm

After move 139, Leela 0.11 estimates black's advantage at 70%, but AQ puts it at only 55%. It is white's last chance to overturn the game, and AQ can indeed win this position with white against Leela 0.11. That means AQ's evaluation of this position is probably more accurate. AQ vs AQ is not interesting at all in this discussion. It is about a different evaluation. AQ also thinks that white 70 was a big mistake (cutting off two stones is too small).

Calling 59 and 65 in game 2 "tenuki" moves is beyond me. They are almost forced answers; 65 especially is an "only" move. Then 87 is again not a tenuki but a direct answer to 86 in a double sente area. After that white plays slack and black gains about 10 points in that area, in sente. 97 is not a tenuki either: black had sente and the area in the lower right was finished for the time being.

Then some here (correctly) pointed to the time record in the SGF. I already did the analysis. This record shows that black in this game played a lot faster than white, which might make one suspicious, BUT a lot of moves were made so fast that there was simply not enough time to transfer the white move to Leela 0.11 and wait for a reasonable analysis from that engine. 80% of the black moves were fast moves.
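A share of fast moves like this can be computed once per-move times are available. The sketch below is only illustrative: the post does not say what cutoff counts as "fast", so `threshold` is an assumed parameter, and the sample times are invented for the example rather than taken from the game.

```python
def fast_move_share(times, threshold):
    """Fraction of moves played in fewer than `threshold` seconds.

    `times` holds seconds spent per move by one player; `threshold` is an
    assumed cutoff below which consulting an engine is judged impractical.
    """
    if not times:
        raise ValueError("no move times given")
    fast = sum(1 for t in times if t < threshold)
    return fast / len(times)


# Hypothetical per-move times in seconds, not taken from the actual game.
sample_times = [3, 4, 19, 9, 2]
print(fast_move_share(sample_times, threshold=10))
```

The result is sensitive to the cutoff chosen, so any such figure is easier to compare when the threshold is stated alongside it.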

Bojanic · Post by **Bojanic** » Fri Jun 15, 2018 2:09 pm

maf wrote:Thank you. I'm sorry that I repeat myself, but I'm still looking for many games by other players checked in the same way, to see if they exhibited that same pattern as the games by CM. Earlier, I gave some naive ideas why that's a possibility. The next step would be to see just how outstanding this truly is. Unless we're sure he's an exception, the whole approach is, as far as I can tell, just not reliable. If that's not currently possible for you, that is fine, it should just be noted.

Maf,
it is written in the paper and was explained earlier, but here it is again:
In the preliminary analysis I examined all 180 games from the A league, with quick GRP settings.
Fewer than 10 games were suspiciously similar to Leela, and a few of those showed differences after a quick analysis and were discarded (one of Metta's, btw).
Only a few games had moves very similar to Leela's, and two of them were Metta's. It is very clear why he stands out most.
I also consider his games the easiest to analyze, because they are the most similar to Leela's.

BTW I started the analysis because of another player. His game is very similar to Leela's, but some of the moves are different, and he is stronger. Overall, it is a more difficult case.

Life In 19x19
