John Fairbairn wrote:
But go is not only many times bigger - the task seems quite different. If it were just a question of searching for matches of whole-board positions using a cached, pre-hashed databank, that would not seem specially interesting. But Kombilo (a magnificent beast) allows you to make searches of rectangles of any size within a cache built (as I infer) from 19x19 hashing. I can't get my head round that at all except by imagining hashing and caching every possible rectangle, which is clearly nowhere near feasible. As a reference point, on my machine Kombilo seems to use about 2G on a 110,000-game database that occupies about 15MB. That seems insanely economical.
Can anyone lift the veil on the magic?
Kombilo (and I've used essentially the same algorithm in q5go) stores two bitmaps per game. These hold the positions where white and black stones appeared at any point during the game. It's not quite the same as the final position, since captured stones are also recorded here. I'm not sure I'd use the term "hashes" for these bitmaps, but I suppose one could.
The search is then a two-pass affair. First, for every game, you try to match the pattern against these bitmaps.
Let's say you're looking for
Code:
XO
OX
and your bitmaps contain, somewhere within the 19x19
Code:
OO XX
.O X.
then you know that the pattern
could match in this game, at this position, if you mirror it, since white and black stones appeared in the right places at some point during the game. So, you have a candidate match. To make certain, you then have to go through all the moves in the candidate games to see if the required pattern is actually reached. The number of mismatches in the candidate is 4 in an empty starting position, and you add/subtract to that number as moves and captures appear. If it reaches zero, you found a match.
I believe Kombilo also uses something else for each game that's more conventionally a hash value. These are used as (hopefully) unique IDs and IIRC they are used to identify games and print annotations such as "Commentary in Go World 72".