Life In 19x19
http://www.lifein19x19.com/

Question about KataGo
http://www.lifein19x19.com/viewtopic.php?f=18&t=17365
Page 2 of 2

Author:  gcao [ Mon Jun 01, 2020 5:50 am ]
Post subject:  Re: Question about KataGo

Great! Thank you!

Author:  gcao [ Mon Jun 01, 2020 10:49 am ]
Post subject:  Re: Question about KataGo

I updated the OpenCL kernels and have created a PR. When you get a chance, I would appreciate it if you could take a little time to review the PR and see whether I missed anything. If not, I'll pull the latest code into my repo and start training.

Whole PR: https://github.com/gcao/KataGo/pull/3/files

OpenCL kernel changes: https://github.com/gcao/KataGo/pull/3/f ... 028e9db1cd

I'm not sure whether I need to update this place as well.
https://github.com/lightvector/KataGo/b ... s.cpp#L125


Here is a short summary of the design for supporting Daoqi in board.h/board.cpp and related code.

I moved the diagonal offsets to a separate field called diag_offsets. Both adj_offsets and diag_offsets hold 8 values: the first 4 are for regular board positions, the last 4 are for edge positions.

In all places where adjacent points are computed, we use adj_offsets[0-3] to find the adjacent points and check their values. If a value is C_WALL, we use the corresponding entry of adj_offsets[4-7] to get the alternative adjacent point on the opposite edge.
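
To make the lookup concrete, here is a self-contained toy version of it (this is not the actual board.h code - the names C_WALL and adj_offsets and the padded-array layout just imitate it):

Code:
#include <cstdio>
#include <vector>

// Toy padded-array board in the style of board.h: the first and last rows
// and column 0 are walls, and a point (x,y) lives at (x+1) + (y+1)*stride.
typedef int Loc;
typedef int Color;
const Color C_EMPTY = 0;
const Color C_WALL = 3;

const int x_size = 5, y_size = 5;
const int stride = x_size + 1;

// First 4 entries: normal N/W/E/S neighbor offsets. Last 4: the alternative
// offsets used when the normal neighbor is C_WALL, wrapping to the opposite
// edge of the board.
const int adj_offsets[8] = {
  -stride, -1, +1, +stride,
  +stride * (y_size - 1), +(x_size - 1), -(x_size - 1), -stride * (y_size - 1),
};

Loc loc(int x, int y) { return (x + 1) + (y + 1) * stride; }

// The lookup described above: try the normal neighbor first, and fall back
// to the wrapped neighbor if we ran into a wall.
Loc wrappedAdj(const std::vector<Color>& colors, Loc l, int i) {
  Loc adj = l + adj_offsets[i];
  if(colors[adj] == C_WALL)
    adj = l + adj_offsets[i + 4];
  return adj;
}

int main() {
  std::vector<Color> colors(stride * (y_size + 2), C_WALL);
  for(int y = 0; y < y_size; y++)
    for(int x = 0; x < x_size; x++)
      colors[loc(x, y)] = C_EMPTY;

  // The east neighbor of the east-edge point (4,2) wraps around to (0,2).
  printf("%d == %d\n", wrappedAdj(colors, loc(4, 2), 2), loc(0, 2));
  return 0;
}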

I hope this will work, but if you find any issue with it, please do let me know. I don't want to start training with a wrong design or implementation, because I feel it's very hard to catch these issues during the training phase.

Author:  lightvector [ Mon Jun 01, 2020 11:13 am ]
Post subject:  Re: Question about KataGo

Cool!

I think it's very hard to catch bugs by reading the code for these kinds of changes - indexing code is easy to mess up in ways that a casual skim won't see, and many bugs come from simply forgetting to change a place that should have been changed, which won't show up in a diff at all.

Your best bet is to test, test, test. Not by running the whole training loop, but by writing some code that uses the board in board.h interactively: see if the rules are actually enforced the way they should be, create some sample positions to check that the ladder detection code works properly across the border, see if the pass-aliveness detection code correctly computes pass-alive groups across the border, etc. The existing tests for many of these things are in tests/testboardbasic.cpp, tests/testboardarea.cpp, etc.
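
For instance, a first sanity check could look something like this (just a sketch - the calls follow board.h and tests.h as I remember them, so double-check the names and signatures against the actual headers):

Code:
#include "../game/board.h"
#include "../tests/tests.h"

void runDaoqiLibertyTest() {
  Board board(9,9);
  // A lone stone on the east edge: with wraparound, its east liberty is on
  // the west edge, so it should have 4 liberties instead of 3.
  Loc eastEdge = Location::getLoc(8, 4, board.x_size);
  board.playMove(eastEdge, P_BLACK, true);  // true: multi-stone suicide legal
  testAssert(board.getNumLiberties(eastEdge) == 4);
}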

Similarly, you can test the neural net's individual layers to see if they work as expected. You can look at cpp/tests/testnn.cpp for an example of this currently - it manually sets up some small input planes of simple floating-point values with some artificial convolution weights, applies the convolution, and compares the result to the expected output (which I computed by hand when writing the test - pretty easy if all the numbers are simple).
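
For the wraparound convolution in particular, one way to get a hand-checkable expected output is a plain reference implementation that you compare the OpenCL kernel's output against. Here is a self-contained sketch of such a reference (ordinary C++, no KataGo code - the torus padding is just my assumption of what the Daoqi convolution should do):

Code:
#include <cstdio>

// Hand-checkable reference for a 3x3 convolution with wraparound (torus)
// padding, to compare the modified OpenCL kernel's output against.
const int SIZE = 4;

void convWrap3x3(const float in[SIZE][SIZE], const float w[3][3], float out[SIZE][SIZE]) {
  for(int y = 0; y < SIZE; y++) {
    for(int x = 0; x < SIZE; x++) {
      float acc = 0.0f;
      for(int dy = -1; dy <= 1; dy++) {
        for(int dx = -1; dx <= 1; dx++) {
          // Indices wrap modulo the board size instead of zero-padding.
          int yy = (y + dy + SIZE) % SIZE;
          int xx = (x + dx + SIZE) % SIZE;
          acc += w[dy + 1][dx + 1] * in[yy][xx];
        }
      }
      out[y][x] = acc;
    }
  }
}

int main() {
  float in[SIZE][SIZE] = {};
  in[0][0] = 1.0f;  // a single impulse in the corner
  float w[3][3] = {{1,2,3},{4,5,6},{7,8,9}};
  float out[SIZE][SIZE];
  convWrap3x3(in, w, out);
  // With wraparound, the corner impulse reaches all four corners: out[0][0]
  // picks up the center weight 5, out[3][3] picks up 9, out[0][3] picks up 6.
  printf("%.0f %.0f %.0f\n", out[0][0], out[3][3], out[0][3]);
  return 0;
}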

Take a look at command/runtests.cpp to see the top-level code that calls down into these tests. Many of them are probably broken now, since they assume the original rules - not that you need to worry about fixing them, but you can of course model your own tests after them. Sorry for the ad-hoc-ness and for not using one of the more standardized C++ testing frameworks.

You can also do testing on the TensorFlow side. Unfortunately, TF1.5 and the complicated estimator interface make it harder to test, but I think one way that should work is to call out to (or even just copy-paste) the relevant functions that you modified, switch into eager mode ("v1.enable_eager_execution()"), and then interactively, in a Python shell, create some tensors with some values in them, call your convolution, and verify that the output tensor has the correct shape and eagerly computes the correct values.
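
For example, an eager-mode check might look something like this (just a sketch - wrap_pad here is a hypothetical stand-in for whatever padding logic you actually modified, written against the TF1 compat API):

Code:
import numpy as np
import tensorflow as tf
tf.compat.v1.enable_eager_execution()

# Hypothetical stand-in for the modified convolution: pad each spatial edge
# with the opposite edge (torus padding), then do a VALID convolution.
def wrap_pad(x, k=1):
    # x: [batch, height, width, channels]
    x = tf.concat([x[:, -k:, :, :], x, x[:, :k, :, :]], axis=1)
    x = tf.concat([x[:, :, -k:, :], x, x[:, :, :k, :]], axis=2)
    return x

x = tf.constant(np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1))
w = tf.constant(np.ones((3, 3, 1, 1), dtype=np.float32))  # 3x3 "sum" filter

y = tf.nn.conv2d(wrap_pad(x), w, strides=[1, 1, 1, 1], padding="VALID")
assert y.shape == (1, 4, 4, 1)        # spatial size is preserved
# (0,0)'s wrapped 3x3 neighborhood: rows {3,0,1} x cols {3,0,1} of the input
assert y.numpy()[0, 0, 0, 0] == 60.0  # 15+12+13 + 3+0+1 + 7+4+5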
