John Fairbairn wrote:I obviously don't know how you collated the raw data, but I have an inkling that people like Go Seigen would be affected because they played so many games with no komi.
I didn't make any attempt to adjust for komi; I just hoped people played approximately as many games as black and white in those days. I suppose the best thing to do would be to add a bias based on the true win rate: if black wins 60% of no-komi games, then two equal players would be assigned that same expected result. I may go back and try this if I find time. I could even try something similar for handicap games, since I think they were more common in the old days.
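To illustrate the idea (this is a hypothetical sketch, not the model actually used in the ratings), a logistic win-probability model can absorb the no-komi black advantage as an additive bias term, chosen so that two equal players reproduce the observed black win rate:

```python
import math

def win_prob_black(r_black, r_white, black_bias=0.0):
    # Logistic (Bradley-Terry style) win probability for black,
    # with an additive bias term for games played without komi.
    return 1.0 / (1.0 + math.exp(-(r_black - r_white + black_bias)))

def bias_from_winrate(p):
    # Solve 1 / (1 + e^{-b}) = p for the bias b, so equal players
    # reproduce the observed no-komi black win rate p.
    return math.log(p / (1.0 - p))

# If black historically won 60% of no-komi games:
b = bias_from_winrate(0.60)
print(round(win_prob_black(0.0, 0.0, b), 2))  # equal players -> 0.6
```

With `black_bias = 0` the same function covers komi games, so both kinds of records could be fit in one model.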
Laman wrote:i think i would be a bit more inclined to believe results based on regular during time changing ELO (or GoR, as it should be pretty similar and i am more familiar with it) and taking maximum achieved value for each player, like in
progor (data to 2008 included, then unfortunately discontinued).
Well, first I'd note those lists are actually pretty similar: all of his top 10 are in my top 25 except Nie Weiping. I chose specifically to avoid models with fiddly parameters that would need tuning, such as ELO or GoR. Not that they're bad necessarily, but I would need to spend time to actually tune those parameters (or trust the values chosen by others, which I'd rather not do!)
Laman wrote:by the way, i am not really sure what i am aiming at with this, but how much would positions (of older players) move if you made 'time-slices' by 5 (or 2 or n) years? i mean including only games up to some date, like to 2010, to 2005, 2000 and so on.
I actually think this is an interesting idea, and easy to try. The problem is that the numbers from different slices would no longer be directly comparable, but it might still be interesting for its own sake.
John Fairbairn wrote:Obviously we have collected games of the most famous/successful players as these are more generally the most interesting/available, but I wonder if there is also the possibility, in the way the algorithm works, that a well-represented player who wins 10 games and scores 1 point each against moderate opponents can creep ahead of a "better" but under-represented player who wins 2 games and scores 4 points each against top opponents.
This is certainly possible, but it's a quite difficult bias to remove. More wins means more evidence of a player's strength. When Han Taehee upset Yi Changho, it was certainly interesting, but it didn't mean he deserved to be at the top of the ratings. The model rewards players who win consistently, as I think any good model should. One option might be to build a second "confidence" parameter, that says "I think player X is really great, but my confidence in this assertion is low due to lack of data". Hmm, I'd have to think about how to implement such a thing.
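One simple way such a confidence parameter could look (a sketch under assumptions, not part of the actual model): treat the rating as an estimate whose standard error shrinks roughly like one over the square root of the number of games, similar in spirit to Glicko's rating deviation. The `spread` scale here is an assumed per-game noise level:

```python
import math

def confidence_interval(rating, n_games, spread=1.0, z=1.96):
    # Hypothetical confidence band: with n_games independent results
    # and an assumed per-game noise scale 'spread', the standard
    # error of the rating estimate shrinks like 1/sqrt(n_games).
    if n_games == 0:
        return (float("-inf"), float("inf"))
    se = spread / math.sqrt(n_games)
    return (rating - z * se, rating + z * se)

# A player with 2 big upsets gets a wide band; one with 100 games
# gets a narrow band, even at a lower central rating.
print(confidence_interval(5.0, 2))    # roughly (3.61, 6.39)
print(confidence_interval(3.0, 100))  # roughly (2.80, 3.20)
```

A ranking could then sort by the lower end of the band, which automatically penalizes under-represented players without ignoring them.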
Kirby wrote:Disagreement isn't pointless. Although, I wonder if "best player" ranking lists are.
Rather than a "best player" ranking list, I would prefer a list of players sorted by characteristics such as “most wins in 2011”, or “longest winning streak from year X to year Y”, etc. Such statistics are objectively measurable, whereas characteristics such as “best player” or “coolest playing style” are not.
Now that I've pulled the data, I can certainly make such lists if you'd like. Maybe that can be a separate post.
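Statistics like "most wins in year X" and "longest winning streak" are straightforward to compute from the game records. A minimal sketch, assuming the records can be reduced to `(winner, loser, year)` tuples (the actual data format here is an assumption):

```python
from collections import Counter

def most_wins(games, year):
    # games: list of (winner, loser, year) tuples (assumed format).
    # Returns (player, win_count) for the given year, or None.
    wins = Counter(w for w, _, y in games if y == year)
    return wins.most_common(1)[0] if wins else None

def longest_streak(results):
    # results: chronological win/loss booleans for one player;
    # returns the longest run of consecutive wins.
    best = cur = 0
    for won in results:
        cur = cur + 1 if won else 0
        best = max(best, cur)
    return best

games = [("A", "B", 2011), ("A", "C", 2011), ("B", "C", 2010)]
print(most_wins(games, 2011))                     # ("A", 2)
print(longest_streak([True, True, False, True]))  # 2
```

Unlike "best player", these lists have a single right answer given the database, though they still inherit its coverage biases.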
From a statistics perspective, I'm actually trying to answer a very specific (if ultimately unanswerable) question: if any two players play against each other, which is more likely to win? And it is actually objectively measurable, in a way: I could take the data from 2000-2004, train the model, then see how well it predicts the results of the matches from 2004-2009, for example. This may be the next thing I do, actually.
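The backtest described above could be sketched like this (a toy stand-in model is used here, since the actual rating model isn't shown; the `(winner, loser, year)` format is also an assumption):

```python
from collections import Counter

def backtest(games, cutoff):
    # Fit a toy "rating = training wins" model on games before the
    # cutoff year, then measure how often it picks the actual
    # winner in the later games.
    train = [g for g in games if g[2] < cutoff]
    test = [g for g in games if g[2] >= cutoff]
    ratings = Counter(w for w, _, _ in train)  # stand-in model
    correct = sum(1 for w, l, _ in test if ratings[w] > ratings[l])
    return correct / len(test) if test else None

games = [("A", "B", 2001), ("A", "C", 2003),
         ("A", "B", 2006), ("B", "C", 2007)]
print(backtest(games, 2005))  # -> 0.5
```

Swapping in the real model for the `Counter` line would give an objective accuracy number to compare candidate models, which is exactly the test that would justify (or not) a more complicated model.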
As for coolest playing style, I can only say that in my opinion winning is very cool.
Harleqin wrote:I think that you fold too much information into a single number.
When reading the title, I thought "Simplified? ELO is too simple already!"
I agree. I simplified for two reasons: one, I only have so much time (I did this to take a break from the programming I'm supposed to be doing!), and two, as mentioned above, ELO (and most other models) have somewhat arbitrary numbers representing how much a win should improve your rating, etc. If I do move to a more complicated model, I would want to think for a bit about how to do that, and how to test if it's actually better (probably something like the test mentioned in response to Kirby's post).