All times are UTC - 8 hours [ DST ]




 Post subject: Contribute to Katago training using google colab
Post #1 Posted: Fri Feb 26, 2021 3:24 pm 
Dies in gote

Posts: 25
Liked others: 0
Was liked: 12
Rank: 18 kyu
I've made a Google Colab notebook image that can contribute to KataGo training.
You can now join the contribution effort without a GPU of your own.
Just check the link below.
https://colab.research.google.com/drive ... sp=sharing


Attachments:
20210224_colab.png [ 274.27 KiB | Viewed 13051 times ]

This post by seventeen was liked by 3 people: ez4u, go4thewin, wineandgolover
 Post subject: Re: Contribute to Katago training using google colab
Post #2 Posted: Fri Feb 26, 2021 5:36 pm 
Lives in sente
User avatar

Posts: 866
Liked others: 318
Was liked: 345
This looks cool. I’d love to know more. To start...

What GPUs does this use?

Is it free? For how long?

Is there some sort of limitation that might affect other Google services?

Does it run in the background?

Thanks.

_________________
- Brady
Want to see videos of low-dan mistakes and what to learn from them? Brady's Blunders

 Post subject: Re: Contribute to Katago training using google colab
Post #3 Posted: Fri Feb 26, 2021 6:21 pm 
Lives with ko

Posts: 150
Liked others: 200
Was liked: 30
Rank: 25 kyu
edit: Is it possible to make a script like this to train the 15b net on the new s663 40b data? thanks!


Last edited by go4thewin on Mon Mar 01, 2021 7:26 am, edited 2 times in total.
 Post subject: Re: Contribute to Katago training using google colab
Post #4 Posted: Fri Feb 26, 2021 8:40 pm 
Lives in sente
User avatar

Posts: 866
Liked others: 318
Was liked: 345
go4thewin wrote:
Really simple and fun to use. it uses a T4, it is free with usage limits. it turns off after 12 hours or less if you paste the following in the google chrome console (f12)
Code:
// Click Colab's connect button once a minute so the session is not treated as idle.
function ClickConnect() {
  console.log("Connect Clicked - Start");
  document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
  console.log("Connect Clicked - End");
}
setInterval(ClickConnect, 60000);


In 12 hours, you get more than 250 training games, 13000 rows, and a few rating games, which is really nice. It will turn off in 90 minutes or less without the code above. If you close the browser, it will turn off. If you use it 12 hours every day, you might get kicked off for a couple of months, not sure. Every other day might be OK. It will not affect other Google services. Thanks seventeen!


If one is non-technical, has never messed with the Chrome console, and doesn't want to screw things up, where exactly in the Chrome console should one paste this? Top? Bottom? Embedded somewhere? Or does it not matter?

Also, I assume you mean. "it turns off after 12 hours or less UNLESS you paste..."?

Finally, I agree that it is really simple. Even I got it running easily. If you want to help make the strongest open-sourced go engine even better, please run this in the background when you use your computer. Highly recommended!!!!!

 Post subject: Re: Contribute to Katago training using google colab
Post #5 Posted: Sat Feb 27, 2021 3:03 am 
Lives with ko

Posts: 150
Liked others: 200
Was liked: 30
Rank: 25 kyu
Yes, sorry about that. In the picture below, paste the following code where the bottommost > sign is. I'll delete the previous redundant post.

Code:
// Click the Colab connect button once a minute so the session stays connected.
function ClickConnect() {
    console.log("Clicked on connect button");
    document.querySelector("colab-connect-button").click();
}
setInterval(ClickConnect, 60000);


This post by go4thewin was liked by: wineandgolover
 Post subject: Re: Contribute to Katago training using google colab
Post #6 Posted: Sat Feb 27, 2021 3:42 am 
Oza
User avatar

Posts: 2408
Location: Tokyo, Japan
Liked others: 2346
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
I am trying this out also. It is indeed simple to do. :tmbup:
I am running it in Firefox and for whatever reason, it does not shut down by itself after 90 minutes. Just now I came back after about 3 hours and it's still running. Great job! Thanks

_________________
Dave Sigaty
"Short-lived are both the praiser and the praised, and rememberer and the remembered..."
- Marcus Aurelius; Meditations, VIII 21

 Post subject: Re: Contribute to Katago training using google colab
Post #7 Posted: Sat Feb 27, 2021 11:57 am 
Lives in sente
User avatar

Posts: 866
Liked others: 318
Was liked: 345
wineandgolover wrote:
This looks cool. I’d love to know more. To start...

1. What GPUs does this use?

2. Is it free? For how long?

3. Is there some sort of limitation that might affect other Google services?

4. Does it run in the background?

Thanks.


To answer my own questions.

1. I’ve connected to a Tesla T4 each time, getting around 390 nn evals/second.

2. It’s completely free. It uses Google Colab, a free machine learning tool. It runs in the browser, so it’s platform independent.

3. It does not affect other Google services. There is a limitation within Colab in that it will stop working for heavy users. Looking on Discord, it seems running it for twelve hours every other day avoids any problems.

4. Yes, it runs in the background. Google provides the CPUs and GPUs. All you need is a Google Drive account and a browser.

If you’d like to run it more stably, for longer, and probably get assigned a better GPU, you can consider Colab Pro, which costs $10 per month with a US or Canadian address. That seems pretty reasonable versus buying and powering your own Tesla V100. I might try Pro soon to check out its performance. I’d love to hear if anybody else has already done so.

Again, I encourage anyone who wishes to help make katago stronger to consider running this completely free utility in the background.

 Post subject: Re: Contribute to Katago training using google colab
Post #8 Posted: Sun Feb 28, 2021 3:48 pm 
Dies in gote

Posts: 26
Liked others: 0
Was liked: 8
I followed your instructions and got the following error:

Starting KataGo training...
2021-02-28 22:39:59+0000: Distributed Self Play Engine starting...
2021-02-28 22:39:59+0000: Attempting to connect to server
2021-02-28 22:39:59+0000: isSSL: true
2021-02-28 22:39:59+0000: host: katagotraining.org
2021-02-28 22:39:59+0000: port: 443
2021-02-28 22:39:59+0000: baseResourcePath: /
2021-02-28 22:39:59+0000: KataGo v1.8.0
2021-02-28 22:39:59+0000: Git revision: 8ffda1fe05c69c67342365013b11225d443445e8
2021-02-28 22:39:59+0000: Running tiny net to sanity-check that GPU is working
2021-02-28 22:39:59+0000: nnRandSeed0 = 10486611865130445872
2021-02-28 22:39:59+0000: After dedups: nnModelFile0 = katago_contribute/kata1/tmpTinyModel.bin.gz useFP16 auto useNHWC auto
terminate called after throwing an instance of 'StringError'
what(): OpenCL error at /home/dwugcloud/data/kata/cpp/neuralnet/openclhelpers.cpp, func err, line 263, error CL_PLATFORM_NOT_FOUND_KHR

 Post subject: Re: Contribute to Katago training using google colab
Post #9 Posted: Sun Feb 28, 2021 4:42 pm 
Oza
User avatar

Posts: 2408
Location: Tokyo, Japan
Liked others: 2346
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
When the process is running on colab, I am seeing these security warnings constantly in the console
Code:
Content Security Policy: Ignoring “'report-sample'” within script-src: ‘strict-dynamic’ specified
Content Security Policy: Ignoring “https:” within script-src: ‘strict-dynamic’ specified
Content Security Policy: Ignoring “http:” within script-src: ‘strict-dynamic’ specified
Content Security Policy: Ignoring “'unsafe-inline'” within script-src: ‘strict-dynamic’ specified
Content Security Policy: Ignoring “https://www.google.com/js/bg/” within script-src: ‘strict-dynamic’ specified
Content Security Policy: Ignoring “https://www.google.com/recaptcha/” within script-src: ‘strict-dynamic’ specified

Is this something that should be fixed or can we just ignore it?

Meanwhile I am currently increasing the "maxSimultaneousGames" at the bottom of the script. Going from 8 (default) to 12 jumped "nn evals" from around 380/second to around 470/second.

 Post subject: Re: Contribute to Katago training using google colab
Post #10 Posted: Sun Feb 28, 2021 4:53 pm 
Oza
User avatar

Posts: 2408
Location: Tokyo, Japan
Liked others: 2346
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
deungsan wrote:
I followed your instructions and got the following error:

Starting KataGo training...
2021-02-28 22:39:59+0000: Distributed Self Play Engine starting...
2021-02-28 22:39:59+0000: Attempting to connect to server
2021-02-28 22:39:59+0000: isSSL: true
2021-02-28 22:39:59+0000: host: katagotraining.org
2021-02-28 22:39:59+0000: port: 443
2021-02-28 22:39:59+0000: baseResourcePath: /
2021-02-28 22:39:59+0000: KataGo v1.8.0
2021-02-28 22:39:59+0000: Git revision: 8ffda1fe05c69c67342365013b11225d443445e8
2021-02-28 22:39:59+0000: Running tiny net to sanity-check that GPU is working
2021-02-28 22:39:59+0000: nnRandSeed0 = 10486611865130445872
2021-02-28 22:39:59+0000: After dedups: nnModelFile0 = katago_contribute/kata1/tmpTinyModel.bin.gz useFP16 auto useNHWC auto
terminate called after throwing an instance of 'StringError'
what(): OpenCL error at /home/dwugcloud/data/kata/cpp/neuralnet/openclhelpers.cpp, func err, line 263, error CL_PLATFORM_NOT_FOUND_KHR


My startup looks like this...
Code:
Starting KataGo training...
2021-02-28 21:51:52+0000: Distributed Self Play Engine starting...
2021-02-28 21:51:52+0000: Attempting to connect to server
2021-02-28 21:51:52+0000: isSSL: true
2021-02-28 21:51:52+0000: host: katagotraining.org
2021-02-28 21:51:52+0000: port: 443
2021-02-28 21:51:52+0000: baseResourcePath: /
2021-02-28 21:51:52+0000: KataGo v1.8.0
2021-02-28 21:51:52+0000: Git revision: 8ffda1fe05c69c67342365013b11225d443445e8
2021-02-28 21:51:52+0000: Running tiny net to sanity-check that GPU is working
2021-02-28 21:51:52+0000: nnRandSeed0 = 1331183443207076973
2021-02-28 21:51:52+0000: After dedups: nnModelFile0 = katago_contribute/kata1/tmpTinyModel.bin.gz useFP16 auto useNHWC auto
2021-02-28 21:51:52+0000: Cuda backend thread 0: Found GPU Tesla T4 memory 15843721216 compute capability major 7 minor 5
2021-02-28 21:51:52+0000: Cuda backend thread 0: Model version 9 useFP16 = true useNHWC = true
2021-02-28 21:51:52+0000: Cuda backend thread 0: Model name: rect15-b2c16-s13679744-d94886722
2021-02-28 21:51:54+0000: Tiny net sanity check complete

As far as I understand what we are doing here (questionable right there! :blackeye: ), you should not be using OpenCL for anything. You should be using CUDA instead.
At the very beginning of the output from your run, do you see...
Code:
Using Katago Backend :  CUDA
GPU :  TeslaT4
/content
Cloning into 'katago-colab'...

This is what I get every time.

 Post subject: Re: Contribute to Katago training using google colab
Post #11 Posted: Sun Feb 28, 2021 6:16 pm 
Dies in gote

Posts: 26
Liked others: 0
Was liked: 8
My error was fixed by changing the notebook settings. Setting the hardware accelerator to "GPU" lets Colab use the Tesla T4.

Now it works fine.
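
For anyone who wants to check this from a notebook cell before starting a run, here is a minimal sketch. It assumes `nvidia-smi` is on the PATH whenever a GPU runtime is attached, and that the notebook's "TeslaT4" spelling is simply the reported name with whitespace removed. Both function names are made up for illustration and are not part of the notebook.

```python
import shutil
import subprocess
from typing import Optional


def normalize_gpu_name(raw: str) -> str:
    """Drop all whitespace, so "Tesla T4" becomes "TeslaT4" (assumed notebook spelling)."""
    return "".join(raw.split())


def detect_gpu_name() -> Optional[str]:
    """Return the first GPU name reported by nvidia-smi, or None if no GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return normalize_gpu_name(lines[0])


if __name__ == "__main__":
    name = detect_gpu_name()
    if name is None:
        print("No GPU found: set the notebook's hardware accelerator to GPU and reconnect.")
    else:
        print("GPU :", name)
```

If this reports no GPU, the OpenCL error above is the likely outcome, so it is worth fixing the notebook setting first.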


Last edited by deungsan on Sun Feb 28, 2021 6:33 pm, edited 2 times in total.
 Post subject: Re: Contribute to Katago training using google colab
Post #12 Posted: Sun Feb 28, 2021 6:23 pm 
Oza
User avatar

Posts: 2408
Location: Tokyo, Japan
Liked others: 2346
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
ez4u wrote:
...

Meanwhile I am currently increasing the "maxSimultaneousGames" at the bottom of the script. Going from 8 (default) to 12 jumped "nn evals" from around 380/second to around 470/second.

The story so far...
Code:
"maxSimultaneousGames" 08  "nn evals"  ~380/sec
"maxSimultaneousGames" 12  "nn evals"  ~470/sec
"maxSimultaneousGames" 16  "nn evals"  ~525/sec
"maxSimultaneousGames" 24  "nn evals"  ~545/sec
"maxSimultaneousGames" 32  "nn evals"  ~555/sec

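To make the diminishing returns in the table above easier to see, here is a small sketch that computes how many extra nn evals/sec each added simultaneous game bought. The numbers are the ones reported in this post; the function name is just for illustration.

```python
# nn evals/sec reported above for each maxSimultaneousGames setting on a T4.
measurements = {8: 380, 12: 470, 16: 525, 24: 545, 32: 555}


def marginal_gains(data):
    """Return (games, evals, extra evals/sec per added game) for each step up."""
    points = sorted(data.items())
    rows = []
    for (g0, e0), (g1, e1) in zip(points, points[1:]):
        rows.append((g1, e1, (e1 - e0) / (g1 - g0)))
    return rows


for games, evals, gain in marginal_gains(measurements):
    print(f"{games:>2} games: {evals} evals/sec, +{gain:.2f} evals/sec per added game")
```

Going from 8 to 12 games adds about 22.5 evals/sec per extra game, while going from 24 to 32 adds only about 1.25, so most of the benefit is already there by 16.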
 Post subject: Re: Contribute to Katago training using google colab
Post #13 Posted: Mon Mar 01, 2021 5:45 am 
Lives in sente

Posts: 758
Liked others: 114
Was liked: 916
Rank: maybe 2d
Yeah, GPUs really like it when you have large batches to run in parallel, and more games helps with that.

The one thing I would caution is: please don't make the number of simultaneous games too large relative to the number of games you play in a given run before you shut it down or it shuts itself down. Ideally, the total number of games you get per session should be at least 10x or 20x the number of simultaneous games.

The reason is if the total is too small, such that the games are coming in relatively few "waves" before it gets killed, it will create a bias towards short games in the data - because in the last wave, disproportionately short games will be the ones that finish and get uploaded and not the longer ones. Games that were on small boards, or that had fewer fights and were more peaceful, or that initialized starting from later positions, etc. will be favored over the configured and desired distribution.
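
The effect can be seen in a toy simulation. This is only a sketch under made-up assumptions: game lengths are uniform between 50 and 350 time units (not KataGo's real distribution), each slot starts a new game as soon as the previous one ends, and any game still running at shutdown is lost. The function name and parameters are hypothetical.

```python
import random


def mean_finished_length(slots, session_len, sessions=200, seed=0):
    """Average length of the games that finish before the session is killed.

    Each of `slots` parallel games is replaced by a new one as soon as it ends.
    Game lengths are uniform on [50, 350] time units (an arbitrary toy choice).
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(sessions * slots):
        clock = 0.0
        while True:
            length = rng.uniform(50, 350)
            clock += length
            if clock > session_len:
                break  # this game was still running at shutdown, so it is lost
            total += length
            count += 1
    return total / count


short = mean_finished_length(slots=16, session_len=400)
long_run = mean_finished_length(slots=16, session_len=10_000)
print(f"true mean length: 200, short session: {short:.1f}, long session: {long_run:.1f}")
```

With the short session each slot completes only one or two games, and the average uploaded game comes out noticeably shorter than the true mean of 200. With the long session each slot completes dozens of games and the average lands close to 200.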


This post by lightvector was liked by 2 people: ez4u, wineandgolover
 Post subject: Re: Contribute to Katago training using google colab
Post #14 Posted: Tue Mar 02, 2021 2:08 pm 
Lives in sente
User avatar

Posts: 866
Liked others: 318
Was liked: 345
ez4u wrote:
As far as I understand what we are doing here (questionable right there! :blackeye: ), you should not be using OpenCL for anything. You should be using CUDA instead.
At the very beginning of the output from your run do you see...
Code:
Using Katago Backend :  CUDA
GPU :  TeslaT4
/content
Cloning into 'katago-colab'...

This is what I get every time.


Yeah, I also saw that every time until last night.

I ran the Colab script very late last night, and it assigned me to an A100, so I was excited. (Note, this information is wrong, and corrected in subsequent posts) But it didn't use CUDA, instead opting for OpenCL. I waited a half hour and it hadn't finished any games, so I quit and went to bed. I can't guarantee that my interpretation of what happened is perfect, but I think I'm right. Maybe the cat hit a kill switch. I should have taken a screenshot, sorry.

I assume the A100 should support CUDA, right?

Is there a known reason why OpenCL should fail on Colab?

Should I change the very beginning of the script, currently set to KATAGO_BACKEND="AUTO", to read KATAGO_BACKEND="CUDA"?

Anyway, today I'm back on a good old T4 and it chose CUDA again. Following recommendations, I increased the number of games to 16, and it's chugging along nicely (about 530 nn evals/sec).

Thanks!

Last edited by wineandgolover on Tue Mar 16, 2021 3:35 pm, edited 1 time in total.
 Post subject: Re: Contribute to Katago training using google colab
Post #15 Posted: Tue Mar 02, 2021 8:02 pm 
Oza
User avatar

Posts: 2408
Location: Tokyo, Japan
Liked others: 2346
Was liked: 1332
Rank: Jp 6 dan
KGS: ez4u
At the beginning of the script, probably we can take this
Code:
  if gpu_name == "TeslaT4":
    KATAGO_BACKEND="CUDA"
  else:
    KATAGO_BACKEND="OPENCL"

and make it this?
Code:
  if gpu_name == "TeslaT4":
    KATAGO_BACKEND="CUDA"
  elif gpu_name == "A100":
    KATAGO_BACKEND="CUDA"
  else:
    KATAGO_BACKEND="OPENCL"
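
A table-driven version of the same idea might look like the sketch below, so that adding a GPU means editing one set rather than another branch. The names `CUDA_GPUS` and `resolve_backend` are hypothetical and not from the notebook. Only "TeslaT4" is a name the notebook is known to print, so any other entry would need to be checked against the actual `gpu_name` value and tested for speed first.

```python
# GPU names (as printed by the notebook) that should use the CUDA build.
# Only "TeslaT4" is confirmed in this thread; add others only after testing.
CUDA_GPUS = {"TeslaT4"}


def resolve_backend(requested: str, gpu_name: str) -> str:
    """Pick "CUDA" or "OPENCL"; an explicit request overrides the automatic choice."""
    requested = requested.upper()
    if requested in ("CUDA", "OPENCL"):
        return requested
    if requested != "AUTO":
        raise ValueError(f"unknown backend: {requested!r}")
    return "CUDA" if gpu_name in CUDA_GPUS else "OPENCL"


print(resolve_backend("AUTO", "TeslaT4"))
```

Setting the request to "CUDA" instead of "AUTO" forces CUDA on any GPU, which is a quick way to compare the two backends on a card that is not in the set.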

 Post subject: Re: Contribute to Katago training using google colab
Post #16 Posted: Tue Mar 02, 2021 9:22 pm 
Lives in sente
User avatar

Posts: 866
Liked others: 318
Was liked: 345
After a few searches, I now suspect I was connected to a P100, not A100. A huge difference of course.

I'd still guess that P100 supports CUDA though.

 Post subject: Re: Contribute to Katago training using google colab
Post #17 Posted: Thu Mar 04, 2021 12:39 pm 
Lives in sente
User avatar

Posts: 866
Liked others: 318
Was liked: 345
wineandgolover wrote:
After a few searches, I now suspect I was connected to a P100, not A100. A huge difference of course.

I'd still guess that P100 supports CUDA though.

Again to answer my own question.

I was so excited to contribute to KataGo that I used the script a lot and Google put the brakes on my use of Colab. (Don't run it all the time if you want to use it for free, folks.)

I've upgraded to Colab Pro, because I'm still excited to help. And paying $10/month for the duration of the project is a hell of a lot cheaper than buying and running a powerful GPU.

Since upgrading, I've been assigned a P100 GPU every time, which I believe is supposed to be better than the T4. Perhaps that is so, but not for this job, with this Colab script.

After several tests on the P100, in which I forced it to use CUDA, it worked, but was significantly slower than OpenCL. So the script is right to use OpenCL.

Unfortunately, the P100 running OpenCL is also slower than the T4 GPU running CUDA. I am currently getting around 340 nnevals/second with 8 simultaneous games, and 385 nnevals/second with 16. (versus 380 and 525 nnevals/sec for the T4 running CUDA)

Unfortunately, with Colab, I believe you have to take the GPU you are given. And because I am a good paying customer, I consistently get the "superior" P100.

The good news is that I seem to be able to run scripts in two browser windows, and it disconnects far less often. And so far, Google hasn't told me to back off.

I still strongly recommend every go player try it. It's kind of like the SETI project for go players, except that Google is running the GPUs, so all you need is a simple browser tab. It's also easy: I'm running the script from the first post, just with my user name and password, no other changes. And it's free for those of you less obsessive than me.

 Post subject: Re: Contribute to Katago training using google colab
Post #18 Posted: Mon Mar 08, 2021 7:17 am 
Lives with ko

Posts: 150
Liked others: 200
Was liked: 30
Rank: 25 kyu
If you don't want to train KataGo, but just play rating games, would this work?
https://github.com/portkata/KataGo/blob ... ames.ipynb
You would probably have to replace CUDA with AUTO.

 Post subject: Re: Contribute to Katago training using google colab
Post #19 Posted: Mon Mar 15, 2021 12:50 am 
Dies in gote

Posts: 25
Liked others: 0
Was liked: 12
Rank: 18 kyu
The notebook image has been updated today to use KataGo v1.8.1.


This post by seventeen was liked by 2 people: ez4u, wineandgolover
 Post subject: Re: Contribute to Katago training using google colab
Post #20 Posted: Mon Apr 19, 2021 1:28 am 
Dies in gote

Posts: 25
Liked others: 0
Was liked: 12
Rank: 18 kyu
The KataGo v1.8.2 engines were released today.

So I've updated the Colab notebook image to use the new KataGo version.

You may need to copy the Colab notebook image again.

Thanks.
