Rémi wrote:
    Rating algorithms must be tested on real data. You can generate artificial data based on some model, and then the best rating system would be the rating system that assumes this model. But the fact that an algorithm is the best at predicting the artificial data does not imply that it will be the best at predicting the real data. The only way to measure the ability of an algorithm to predict real game outcomes is to measure how well it predicts real game outcomes.

Thank you for your answer, but I think you misunderstood my point. The goal would not be to test the algorithm itself; it would be to get a rough idea of the theoretical best prediction rate that could be achieved ("rough" because it would be based on an approximation of the KGS database).
Going from a 55.7% to a 55.8% prediction rate would look very different if you could give a convincing argument that the prediction rate is capped at around, say, 57%, than if it could (theoretically) reach 100%.
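To make the idea concrete, here is a minimal sketch of what I mean. It assumes a hypothetical player pool whose true strengths follow the standard Elo logistic model (the pool size, number of games, and rating spread are all made-up parameters, not KGS statistics). An oracle that knows the true ratings and always predicts the favourite achieves the best possible prediction rate under that model; measuring its accuracy on simulated games gives a rough ceiling. A better approximation of KGS would need a realistic rating distribution and, importantly, KGS-like matchmaking, since pairing players of similar strength pushes the ceiling down toward 50%.

```python
import random

random.seed(0)

# Made-up parameters for illustration only; a real estimate would
# fit these to the KGS database.
N_PLAYERS = 1000
N_GAMES = 100_000
SIGMA = 200.0  # assumed spread of true ratings, in Elo points

# Draw each player's true (hidden) rating.
ratings = [random.gauss(0.0, SIGMA) for _ in range(N_PLAYERS)]

def win_prob(ra, rb):
    """Probability that A beats B under the Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))

correct = 0
for _ in range(N_GAMES):
    a, b = random.sample(range(N_PLAYERS), 2)
    p = win_prob(ratings[a], ratings[b])
    a_wins = random.random() < p
    # The oracle knows the true ratings and predicts the favourite.
    predicted_a_wins = p >= 0.5
    correct += (predicted_a_wins == a_wins)

ceiling = correct / N_GAMES
print(f"oracle prediction rate: {ceiling:.3f}")
```

No real rating algorithm can beat this oracle on data generated by the same model, so its accuracy is the cap against which an algorithm's real prediction rate could be judged.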