Make me a profile! (Rate my opening, middle, endgame)

negapesuo · Post by **negapesuo** » Sun Nov 10, 2019 8:02 pm

In the KGS link are the last 10 games that I've played, so that I'm not being selective about my games (more games played in previous months if interested). I just got to 2k in KGS.

https://www.gokgs.com/gameArchives.jsp?user=negapesuo

If anyone would like to give me an overall assessment of my strength in the following criteria and make me a strength profile, I'd highly appreciate it! I thought this could be a new, interesting and fun way of self-assessment.

Opening
Middle game
Endgame
Reading
"Intuition"

I welcome anyone and everyone's opinion. Thank you!

Bill Spight · Post by **Bill Spight** » Sun Nov 10, 2019 9:58 pm

I don't know about rating you with any precision. Approaching shodan seems right, though.

Some advice.

Opening:

Corners first. In particular, occupy the last open corner unless you have a very good reason not to.

Don’t try to pincer strong groups. Wait to extend on the sides. Corners first, including enclosures from 4-4 points.

As a rule, wait to pincer.

All stages of the game:

Don’t strengthen your opponent’s stones if you can help it.

Opening and middle game:

Learn about shape. Make good shape, avoid bad shape, keep your opponent from making good shape, force him to make bad shape, if you can. Shape is not particularly important in the endgame.

Good luck!

xela · Post by **xela** » Mon Nov 11, 2019 12:25 am

So I have good news and bad news for you.

The bad news: I'm not going to do ten game reviews all at once, that's asking a bit too much! (Bill is clearly more generous than me.)

The good news: I've been thinking recently about how we can turn Leela Zero into a more effective tutor. You've popped up at the right time for me to use you as a case study :-)

Here's the plan:

Opening/middlegame/endgame: the easy part. LZ gives winrates for each move. Let's say that a move is "good" if (winrate of your move)/(winrate of LZ's preferred move) is 0.7 or more. And let's define "opening" as moves 1-30, "middlegame" as moves 31-120 and "endgame" as the rest. Then we can count percentage of good moves per game phase.
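The rule above can be sketched in a few lines of Python. This is only an illustration, not xela's actual script: the names `phase`, `is_good` and `good_move_percentages` and the sample winrates are made up for the example. Only the 0.7 ratio and the move-number boundaries come from the post.

```python
from collections import Counter

GOOD_RATIO = 0.7  # threshold proposed in the post


def phase(move_number: int) -> str:
    """Map a move number to a game phase using the boundaries from the post."""
    if move_number <= 30:
        return "opening"
    if move_number <= 120:
        return "middlegame"
    return "endgame"


def is_good(played_winrate: float, best_winrate: float,
            ratio: float = GOOD_RATIO) -> bool:
    """A move is 'good' if it keeps at least `ratio` of the best move's winrate."""
    if best_winrate <= 0:
        return True  # nothing left to lose; also avoids dividing by zero
    return played_winrate / best_winrate >= ratio


def good_move_percentages(moves):
    """moves: iterable of (move_number, played_winrate, best_winrate)."""
    total, good = Counter(), Counter()
    for number, played, best in moves:
        p = phase(number)
        total[p] += 1
        good[p] += is_good(played, best)
    return {p: 100.0 * good[p] / total[p] for p in total}


# Hypothetical data, winrates as fractions between 0 and 1.
sample = [(5, 0.48, 0.50), (40, 0.30, 0.50), (41, 0.45, 0.46), (150, 0.10, 0.20)]
result = good_move_percentages(sample)
print(result)
```

Here the winrates are assumed to be from the point of view of the player being profiled, so an engine that reports them for the side to move would need flipping for one colour.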

Why am I using a ratio rather than a winrate drop? Other discussions around here tend to focus on "this move was -5%" and "this blunder was -20%". The issue is that amateur games can have big swings: just because your winrate is down to 10%, doesn't mean it's over. But if we say "mistake = winrate goes down 15%", then from that 10% position literally every move will be defined as a good move. I don't think that's what we're looking for.
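To make the difference concrete, here is a small comparison of the two criteria from a 10% position. The function names and the three sample winrates are invented for the example; the 15% drop and the 0.7 ratio are the figures mentioned above.

```python
def good_by_drop(played: float, best: float, max_drop: float = 0.15) -> bool:
    """Drop rule: a move is a mistake only if it loses max_drop or more."""
    return best - played < max_drop


def good_by_ratio(played: float, best: float, ratio: float = 0.7) -> bool:
    """Ratio rule: a move must keep at least `ratio` of the best winrate."""
    return played / best >= ratio


best = 0.10  # the best available move still only wins 10% of the time
for played in (0.09, 0.05, 0.01):
    print(f"played={played:.2f}  drop rule: {good_by_drop(played, best)}  "
          f"ratio rule: {good_by_ratio(played, best)}")
```

Under the drop rule all three moves pass, since nothing can lose 15 points from a 10% position. Under the ratio rule only the first one does.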

Reading versus intuition is harder to capture, but I think we can do it. LZ calculates a policy value for each move, then uses Monte Carlo tree search (MCTS) to do a more in-depth evaluation. So let's say that policy=intuition and MCTS=reading. And let's say that a policy value of 20% or more is a "high policy" move.

If you played a high policy move that turns out to be a mistake, then your intuition was probably OK, so we'll call that one a reading error.
If you played a low policy move, and LZ's preferred move is a high policy move, then it looks like your intuition was off.
If you played a low policy move, and LZ's preferred move is a different low policy move, then we can't say whether the error was based on reading or intuition.
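The three rules can be written as one small function. This is a sketch under the assumption that it is only called for moves already judged to be mistakes and that policy values are fractions between 0 and 1; the name `classify_error` is made up for the example.

```python
HIGH_POLICY = 0.20  # "high policy" cut-off from the post


def classify_error(played_policy: float, best_policy: float,
                   high: float = HIGH_POLICY) -> str:
    """Label a mistaken move as a reading error, an intuition error, or unknown."""
    if played_policy >= high:
        # The move looked natural to the network but failed under search.
        return "reading"
    if best_policy >= high:
        # The natural-looking move was available and was not played.
        return "intuition"
    # Neither the played move nor the best move looked natural.
    return "can't say"


print(classify_error(0.35, 0.10))  # reading
print(classify_error(0.02, 0.40))  # intuition
print(classify_error(0.02, 0.05))  # can't say
```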


Now if I can turn these rules into Python code inside an hour, then I can leave my computer running overnight and we'll have your profile numbers tomorrow. If the coding turns out to be more complicated than I think (or if I get distracted along the way), you might be waiting a few days (or someone else might jump in and implement this).

I'll get back to you soon. Maybe.

jlt · Post by **jlt** » Mon Nov 11, 2019 7:58 am

xela wrote:Opening/middlegame/endgame: the easy part. LZ gives winrates for each move. Let's say that a move is "good" if (winrate of your move)/(winrate of LZ's preferred move) is 0.7 or more. And let's define "opening" as moves 1-30, "middlegame" as moves 31-120 and "endgame" as the rest. Then we can count percentage of good moves per game phase.

The method sounds promising. I tried to do that manually on my last KGS game (3k vs. 4k), and detected 6 bad moves by Black. However, all endgame moves were considered as "good" because Black's winrate stayed over 90%, although many mistakes were made. In more detail (hidden because this thread is about negapesuo's games and not mine):

Kirby · Post by **Kirby** » Mon Nov 11, 2019 9:00 am

I understand your desire to want a "profile" of your play. However, I think it's most beneficial to focus on two categories of models if you're interested in a review:

1. Actual moves you played
2. The reason you played the moves you did (it was joseki, you were scared, you felt tired, you were angry and wanted to kill his group, etc.)

To analyze #1, I don't think you'll get better than game reviews. Each board position is different, so putting games up for review is the best way to get feedback on them.

To analyze #2, the reviewer needs to know your reasoning or state of mind behind the moves that you played. We can try to find patterns in a handful of game records, but it's difficult to narrow down what you were thinking for sure.

Given this, I would recommend one of the following:
a. Put up a single game for review, and ask about the *biggest* mistakes - you'll get more direct feedback, because it's easier for someone on L19 to look at
b. If you'd prefer, analyze your game with a bot like LZ or KataGo

From that review, you can aim to find out what your biggest blunders were. And then you can try to recall - what were you thinking or feeling at that point in the game? Why did your move selection process bring about something different? Was it an oversight? A reading error? Or did you have reasoning for the move you played? Express that reasoning, and we can try to understand if there's a problem in judgment.

The bottom line is, I think it's very informative to understand if there are problems in one's judgment or move selection process. To do that, it's helpful to have more than just the moves.

Tryss · Post by **Tryss** » Mon Nov 11, 2019 9:47 am

jlt wrote:The method sounds promising. I tried to do that manually on my last KGS game (3k vs. 4k), and detected 6 bad moves by Black. However, all endgame moves were considered as "good" because Black's winrate stayed over 90%, although many mistakes were made.

Katago helps with this. Here is a graph from one of my games:

[Attachment: Katago Review.png]

Mistakes made when over 90% wr are still visible.

Bill Spight · Post by **Bill Spight** » Mon Nov 11, 2019 10:04 am

Tryss wrote:
jlt wrote:The method sounds promising. I tried to do that manually on my last KGS game (3k vs. 4k), and detected 6 bad moves by Black. However, all endgame moves were considered as "good" because Black's winrate stayed over 90%, although many mistakes were made.
Katago helps with this. Here is a graph from one of my games:

Katago Review.png
Mistakes made when over 90% wr are still visible.

One thing that I find interesting, which I have seen in other KataGo graphs, is how closely the percentages and board points track each other about halfway through the game.

jlt · Post by **jlt** » Mon Nov 11, 2019 12:28 pm

Thanks for the suggestion. Katago and Leelazero generally agree on which moves are bad, but the measure of badness is a bit different so Katago gives a different perspective.

For some stupid reason I thought that Katago couldn't work on my computer which doesn't have a GPU, but actually it does work...

xela · Post by **xela** » Tue Nov 12, 2019 3:28 am

xela wrote:...if I can turn these rules into Python code inside an hour, then I can leave my computer running overnight and we'll have your profile numbers tomorrow.

Nearly there. Python interprocess communication turned out to be trickier than I thought, so I did it the ugly way (ask LZ to write to a log file and then read it back in). Code here for anyone who wants to play with it. It's not a model of beautiful code, but so far it seems to get the job done.
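For anyone curious what the "read the log back in" step might look like, here is a rough sketch. It is not xela's code. It assumes the log contains Leela Zero's per-move search summary lines of the form `move -> visits (V: winrate%) ... (N: policy%) PV: ...`, with an optional LCB column in newer versions. Check that layout against your own log before relying on it. The sample text is made up.

```python
import re

# Assumed layout of one search summary line; adjust if your log differs.
LINE_RE = re.compile(
    r"^\s*(?P<move>[A-T]\d{1,2}|pass)\s*->\s*(?P<visits>\d+)\s*"
    r"\(V:\s*(?P<winrate>[\d.]+)%\)\s*"
    r"(?:\(LCB:\s*-?[\d.]+%\)\s*)?"   # LCB column is optional
    r"\(N:\s*(?P<policy>[\d.]+)%\)"
)


def parse_search_lines(log_text: str):
    """Return one dict per candidate move found in the log text."""
    rows = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append({
                "move": m["move"],
                "visits": int(m["visits"]),
                "winrate": float(m["winrate"]) / 100.0,  # search value
                "policy": float(m["policy"]) / 100.0,    # network prior
            })
    return rows


# Made-up sample resembling the assumed format.
sample_log = """\
Thinking at most 10.0 seconds...
  Q16 ->    5120 (V: 47.12%) (LCB: 46.90%) (N: 31.40%) PV: Q16 D4 Q4
   D4 ->     880 (V: 46.05%) (N:  8.75%) PV: D4 Q16
"""

rows = parse_search_lines(sample_log)
for row in rows:
    print(row)
```

Going through a log file avoids having to read the engine's output pipe while it is still thinking, at the price of depending on the exact text format.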

Should take about 5 1/2 hours compute time for 10 games, so if it doesn't crash while I'm asleep then I'll have some numbers only a day later than first advertised.

xela · Post by **xela** » Tue Nov 12, 2019 7:06 pm

OK, I have some numbers for you now.

After looking at the numbers from my first round of testing, I changed the threshold from 0.7 to 0.8, i.e. a move is counted as "good" if your winrate is at least 80% of LZ's winrate. With the lower number, there weren't enough "mistakes" to make the patterns stand out.

I used LZ network number 157 with 10,000 playouts per move, about 5 hours of computing time (using a smaller, older network so that we can get a decent number of playouts in that time).

By these criteria:

99% of your opening moves are good
76% of your middlegame moves are good
50% of your endgame moves are good
47% of your errors are due to intuition
22% of your errors are due to reading

Detailed breakdown:


           | intuition |   reading | can't say | count of all moves played 
-----------+-----------+-----------+-----------+-------------------------- 
   opening |         1 |         0 |         1 |       150 
middlegame |        47 |        22 |        31 |       424 
   endgame |        97 |        54 |        52 |       402
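The headline percentages can be recomputed from this table. The sketch below assumes that every move not counted in one of the three error columns was a "good" move, and that the error shares are pooled over all three phases; the post does not say how those shares were derived, so the pooling is a guess. The names are made up for the example.

```python
# Counts copied from the table above.
table = {
    "opening":    {"intuition": 1,  "reading": 0,  "can't say": 1,  "moves": 150},
    "middlegame": {"intuition": 47, "reading": 22, "can't say": 31, "moves": 424},
    "endgame":    {"intuition": 97, "reading": 54, "can't say": 52, "moves": 402},
}
KINDS = ("intuition", "reading", "can't say")


def good_percent(row) -> float:
    """Percentage of moves in a phase that were not counted as errors."""
    errors = sum(row[k] for k in KINDS)
    return 100.0 * (row["moves"] - errors) / row["moves"]


def error_share(kind: str) -> float:
    """Share of all errors, pooled over the three phases, of the given kind."""
    total = sum(row[k] for row in table.values() for k in KINDS)
    return 100.0 * sum(row[kind] for row in table.values()) / total


for name, row in table.items():
    print(f"{name:>10}: {good_percent(row):.1f}% good")
for kind in KINDS:
    print(f"{kind:>10}: {error_share(kind):.1f}% of errors")
```

The phase figures come out at about 98.7%, 76.4% and 49.5%, in line with the 99%, 76% and 50% quoted. Pooled this way, intuition is about 47.5% of errors and reading about 24.9%, so the quoted 22% for reading looks closer to the middlegame row on its own.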

Full results:

Negapesuo_autoprofile-2019-11-13.csv

Of course it's possible that there are bugs in my software, and there's plenty of room for improving the methodology here, so don't take this too seriously! For a start, we probably want different thresholds for different game phases as per jlt's suggestion. And using KataGo to account for score difference as well as winrate might be a good idea.

Technical note: I did a few spot checks on some of the moves, and the numbers in my analysis for policy values are different from what Lizzie shows when you press the "show policy" button, although they're about the same relative sizes. I'm not sure what's going on here. My numbers are from the GTP console output when you ask LZ to generate a move.

negapesuo · Post by **negapesuo** » Tue Nov 12, 2019 9:19 pm

Woah what just happened? haha. This is awesome! thank you so much!

Now I'm more interested in your code than in my profile haha. I'm gonna check it out this weekend!

*Looks like my endgame is beyond poor, and I play better when I read according to your analysis.


Knotwilg · Post by **Knotwilg** » Wed Nov 13, 2019 2:45 am

Xela, first of all, awesome effort!

Critical by nature, my first remark is that we should calibrate these results against a sample of others. Maybe we are all better in the opening than in the endgame by these standards?

jlt · Post by **jlt** » Wed Nov 13, 2019 4:10 am

This very bad opening move loses

8% according to Leelazero 157 with 1000 playouts
14% according to 15-block Leelazero trained on 40 blocks
3 points according to Katago.

So the criterion "a move is counted as 'good' if your winrate is at least 80% of LZ's winrate" should probably be changed for opening moves, I imagine that 85% or 90% instead of 80% would be more appropriate.
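A per-phase threshold is a small change to the rule. In the sketch below the 0.90 opening value is one of the two figures suggested above, and the 0.80 values are the current setting. The 50% starting winrate in the example is an assumption made only to turn "loses 8%" into a ratio, and the names are invented.

```python
# Ratio a move must keep, by game phase.
PHASE_RATIO = {"opening": 0.90, "middlegame": 0.80, "endgame": 0.80}


def is_good(phase: str, played_winrate: float, best_winrate: float,
            ratios=PHASE_RATIO) -> bool:
    """Apply the ratio rule with a threshold that depends on the phase."""
    return played_winrate / best_winrate >= ratios[phase]


# An opening move that loses 8 points of winrate from an assumed 50% position.
best, played = 0.50, 0.42
uniform = {"opening": 0.80, "middlegame": 0.80, "endgame": 0.80}
print("uniform 0.80 :", is_good("opening", played, best, uniform))  # passes
print("opening 0.90 :", is_good("opening", played, best))           # flagged
```

With these numbers the ratio is 0.84, so the move passes at 0.80 and is flagged at either 0.85 or 0.90.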

$$B
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 5 . . . . . . . . . . . . . . . . |
$$ | . . . 2 . . . . . , . . . . . 1 . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . 6 . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . 4 . . . . . , . . . . . 3 . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+

This awful move loses even less:

$$B
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 7 8 . 5 . 6 . . . . . . . . . . . |
$$ | . . 9 4 . . . . . , . . . . . 1 . . . |
$$ | . . 0 . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . 2 . . . . . , . . . . . 3 . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+

P.S. Using winrates to evaluate endgame mistakes may be problematic. When the game is very close, 2-point mistakes can induce huge winrate swings.
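A toy model shows the effect. It assumes the winrate is a logistic function of the score lead with a small scale late in the game. That is purely an illustrative assumption, not how Leela Zero or KataGo compute winrates, and the scale of 2 points is invented.

```python
import math


def winrate(lead: float, scale: float = 2.0) -> float:
    """Toy model: winrate as a logistic function of the score lead in points."""
    return 1.0 / (1.0 + math.exp(-lead / scale))


def winrate_drop(lead_before: float, points_lost: float,
                 scale: float = 2.0) -> float:
    """Winrate lost by giving up `points_lost` points from a given lead."""
    return winrate(lead_before, scale) - winrate(lead_before - points_lost, scale)


close = winrate_drop(lead_before=1.0, points_lost=2.0)   # 1 ahead -> 1 behind
safe = winrate_drop(lead_before=20.0, points_lost=2.0)   # 20 ahead -> 18 ahead
print(f"close game: {close:.3f}   safe lead: {safe:.5f}")
```

In this model the same 2-point mistake costs well over 20 points of winrate in the close game and almost nothing with a 20-point lead, which is the pattern described above.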

Bki · Post by **Bki** » Wed Nov 13, 2019 10:44 am

jlt wrote:This awful move loses even less:

$$B
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 7 8 . 5 . 6 . . . . . . . . . . . |
$$ | . . 9 4 . . . . . , . . . . . 1 . . . |
$$ | . . 0 . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . 2 . . . . . , . . . . . 3 . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+

After looking at this situation quickly with LZ, it seems to be because it's not actually an "awful mistake". Leela seems to prefer to counter this hane with the hane on the second line, after which the situation reverts to the usual joseki with White playing the bad push at […]. That push is definitely a mistake, but usually not a game-ending one, so a few percentage points seems correct. Leela seems to think (in this position at least) that the "refutation" I (and probably most others) would play is a mistake that leaves White ahead, as long as White doesn't try to prevent the connection with the approach stone and instead uses the opportunity to get influence facing the left side.

This hane is only a game-losing mistake in DDK/weak SDK games, because those players think the move works and thus unreasonably try to seal Black in, which ends up losing either the […] or the […]-White 8 stones.

Edit: Actually looking at it again, it still cost white some 5-10% winrate, which is significant.

xela · Post by **xela** » Thu Nov 14, 2019 2:52 pm

jlt wrote:So the criterion "a move is counted as 'good' if your winrate is at least 80% of LZ's winrate" should probably be changed for opening moves, I imagine that 85% or 90% instead of 80% would be more appropriate.

Well, I'm definitely open to the idea that different criteria apply to different phases of the game. But at the same time, I'm wondering: is the goal to play good moves in the sense that a pro might look over your shoulder, nod approvingly and say "I like your style"? Or is the goal to play moves that are likely to win the game? I suspect that for most of us amateurs, the game is really won or lost in the middlegame most of the time. We try to play good opening moves because it feels good, but actually "not awful" is all we need out of the opening.

I'm remembering the shock of the early 2000s when most western players had learned mainly from Japanese books and played online mainly against Japanese players on IGS, and then the Dashn server opened up the chance to play against Korean amateurs. So there were all these SDK games that started with "look at all the bad shape and all the direction of play mistakes in the opening, this will be an easy win" and ended up with "OMG this guy can FIGHT!!!" We learned that a lead in the opening didn't guarantee a win by any means.

Life In 19x19
