The KGS server keeps crashing
- Suji
- Lives in gote
- Posts: 302
- Joined: Wed May 19, 2010 2:25 pm
- Rank: DDK
- GD Posts: 0
- KGS: Sujisan 12 kyu
- OGS: Sujisan 13 kyu
- Has thanked: 70 times
- Been thanked: 8 times
The KGS server keeps crashing
What's wrong with the server tonight? It's apparently crashed twice.
Hopefully, it's nothing serious.
My plan to become an SDK is here.
- wms
- Lives in gote
- Posts: 450
- Joined: Tue Apr 20, 2010 4:23 pm
- GD Posts: 0
- KGS: wms
- Location: Portland, OR USA
- Has thanked: 257 times
- Been thanked: 287 times
Re: The KGS server keeps crashing
Not sure what it is. It is a crash that has been there since version 3.0.0; it's a bug that I haven't been able to track down. Usually it hits about once every 60 days, but in the past 24 hours it has hit 4 times instead.
It is possible that somebody has started doing whatever triggers the bug *A LOT*. But that would be strange, because I didn't think the bug was caused by anything a user does. It is caused by memory corruption in my low-level networking code. This code is extremely tricky: it is heavily multithreaded, written in C (the only part of the server that is), and uses the Linux epoll interface, which explains why there has been a bug that I've known about for 3+ years but haven't been able to fix.
If it keeps hitting...well, that will be useful information, but I'd rather get it another way of course.
- Ember
- Lives with ko
- Posts: 286
- Joined: Sun May 09, 2010 5:32 am
- Rank: EGF 3-4k - KGS 2-3k
- GD Posts: 0
- Online playing schedule: A schedule..? When hell freezes over... maybe. ^^;
- Location: Germany
- Has thanked: 146 times
- Been thanked: 81 times
Re: The KGS server keeps crashing
Well, I can't access KGS at all now (from Germany); the homepage seems to be down, too.
At first, when I tried to log in, I immediately got the message that the server might be down. After waiting a bit, that message took some time to appear, and now there is no message at all, but I still can't get onto the server; the client just keeps on trying and trying and...
But I guess you're already working on it, wms, so I hope that you'll fix that bug soon and everything will be all right.
EDIT: Please ignore the message below..
EDIT 2: Well, I guess it was a bit too early to triumph.. ^^; It crashed again.
Last edited by Ember on Sun May 23, 2010 2:19 am, edited 1 time in total.
- wms
- Lives in gote
- Posts: 450
- Joined: Tue Apr 20, 2010 4:23 pm
- GD Posts: 0
- KGS: wms
- Location: Portland, OR USA
- Has thanked: 257 times
- Been thanked: 287 times
Re: The KGS server keeps crashing
Yeah, tried a reboot to see if that helped. It looks like "no."
The server hasn't had an upgrade for months. So something new changed outside the server that made this bug show up a lot more often. No idea what.
- Phelan
- Gosei
- Posts: 1449
- Joined: Tue Apr 20, 2010 3:15 pm
- Rank: KGS 6k
- GD Posts: 892
- Has thanked: 1550 times
- Been thanked: 140 times
Re: The KGS server keeps crashing
There were some authentication problems with the desktop version a while back. Are you using that? Either way, a screenshot or the details of the popup might help.
- CarlJung
- Lives in gote
- Posts: 429
- Joined: Wed Apr 21, 2010 1:10 pm
- Rank: SDK
- GD Posts: 0
- KGS: CarlJung
- Location: Sweden
- Has thanked: 101 times
- Been thanked: 73 times
Re: The KGS server keeps crashing
Helel wrote: ...Swedish text in picture...
Ha ha, I didn't know you were Swedish.
FusekiLibrary, an opening library.
SGF converter tools: Wbaduk NGF to SGF | 440 go problems | Fuseki made easy | Tesuji made easy | Elementary training & Dan level testing | Dan Tutor Shortcut To Dan
- tj86430
- Gosei
- Posts: 1348
- Joined: Wed Apr 28, 2010 12:42 am
- Rank: FGA 7k GoR 1297
- GD Posts: 0
- Location: Finland
- Has thanked: 49 times
- Been thanked: 129 times
Re: The KGS server keeps crashing
Since my line is quite slow, I'd really appreciate it if the pictures weren't several megabytes...
Offending ad removed
- EdLee
- Honinbo
- Posts: 8859
- Joined: Sat Apr 24, 2010 6:49 pm
- GD Posts: 312
- Location: Santa Barbara, CA
- Has thanked: 349 times
- Been thanked: 2070 times
Re: The KGS server keeps crashing
Helel wrote: I only did it to annoy you. It will not become a habit.
Helel, you can also try:
(1) Alt+"Prnt Scrn" to grab only the error window.
(2) Paste it in an editor like Irfanview -- http://www.irfanview.com/
(3) Reduce the screenshot size.
(4) Save as jpeg at 80% quality (see attached).
This usually reduces the file size by over 90%.
Or, you can TYPE all the error messages by hand.
- Attachments
- xxx.jpg (61.49 KiB)
- xxx1.jpg (55.81 KiB)
- tj86430
- Gosei
- Posts: 1348
- Joined: Wed Apr 28, 2010 12:42 am
- Rank: FGA 7k GoR 1297
- GD Posts: 0
- Location: Finland
- Has thanked: 49 times
- Been thanked: 129 times
Re: The KGS server keeps crashing
Helel wrote: Swedish is, after all, spoken even by Scanians and other riffraff.
Yes, even almost ten percent of Finns speak Swedish as their mother tongue (though I am not one of them).
- Suji
- Lives in gote
- Posts: 302
- Joined: Wed May 19, 2010 2:25 pm
- Rank: DDK
- GD Posts: 0
- KGS: Sujisan 12 kyu
- OGS: Sujisan 13 kyu
- Has thanked: 70 times
- Been thanked: 8 times
Re: The KGS server keeps crashing
wms wrote: Not sure what it is. It is a crash that has been there since version 3.0.0; it's a bug that I haven't been able to track down. Usually it hits about once every 60 days, but in the past 24 hours it has hit 4 times instead.
It is possible that somebody has started doing whatever triggers the bug *A LOT*. But that would be strange, because I didn't think the bug was caused by anything a user does. It is caused by memory corruption in my low-level networking code. This code is extremely tricky: it is heavily multithreaded, written in C (the only part of the server that is), and uses the Linux epoll interface, which explains why there has been a bug that I've known about for 3+ years but haven't been able to fix.
If it keeps hitting...well, that will be useful information, but I'd rather get it another way of course.
Hmmm...Interesting. Hopefully, you can find it and fix it.
On a lighter note, there are two quotes that I thought of.
1. "If debugging is the process of removing bugs from a program, then programming is the process in which bugs are introduced to the program."
2. "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
@WMS: In what way would you like to receive the information?
- wms
- Lives in gote
- Posts: 450
- Joined: Tue Apr 20, 2010 4:23 pm
- GD Posts: 0
- KGS: wms
- Location: Portland, OR USA
- Has thanked: 257 times
- Been thanked: 287 times
Re: The KGS server keeps crashing
When there's a bug, it's best if the bug shows up when I test so that I can fix it there.
But this bug happens so rarely, and I don't know how to make it happen, so I only get info about it from crashes...and very, very little info even then. All I've been able to pin down is that variables are getting utter nonsense in them. A boolean, for example, will have a number in it instead of a 0 or 1. I suspect that I'm walking past the end of an array in the network code, or something like that, but I can't figure out where that could be happening.
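One common way to catch an overrun like the one suspected above is to surround a buffer with known guard bytes and check them from time to time. What follows is only a minimal sketch in C, not the KGS networking code: `guarded_buf`, `guarded_ok`, `buggy_fill`, and the sizes are hypothetical, and the overrun is simulated inside one array so that the example itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: [guard][payload][guard], all inside ONE array, so the
 * deliberate off-by-one below never leaves the object. */
#define GUARD_LEN   8
#define PAYLOAD_LEN 16
#define GUARD_BYTE  0xA5

typedef struct {
    unsigned char bytes[GUARD_LEN + PAYLOAD_LEN + GUARD_LEN];
} guarded_buf;

/* Fill both guard regions with the known pattern and zero the payload. */
static void guarded_init(guarded_buf *b) {
    memset(b->bytes, GUARD_BYTE, sizeof b->bytes);
    memset(b->bytes + GUARD_LEN, 0, PAYLOAD_LEN);
}

/* Pointer to the usable payload area between the two guards. */
static unsigned char *guarded_payload(guarded_buf *b) {
    return b->bytes + GUARD_LEN;
}

/* Returns 1 if both guard regions are intact, 0 if anything overwrote them. */
static int guarded_ok(const guarded_buf *b) {
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (b->bytes[i] != GUARD_BYTE) return 0;
        if (b->bytes[GUARD_LEN + PAYLOAD_LEN + i] != GUARD_BYTE) return 0;
    }
    return 1;
}

/* Simulated bug: writes n bytes without checking n <= PAYLOAD_LEN.
 * Callers in this sketch keep n <= PAYLOAD_LEN + GUARD_LEN. */
static void buggy_fill(guarded_buf *b, size_t n) {
    unsigned char *p = guarded_payload(b);
    for (size_t i = 0; i < n; i++) p[i] = 0x37;
}
```

Calling `guarded_ok` after every operation that touches the buffer narrows down which write did the damage, long before a stray value shows up in an unrelated variable such as a boolean.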
- CarlJung
- Lives in gote
- Posts: 429
- Joined: Wed Apr 21, 2010 1:10 pm
- Rank: SDK
- GD Posts: 0
- KGS: CarlJung
- Location: Sweden
- Has thanked: 101 times
- Been thanked: 73 times
Re: The KGS server keeps crashing
Helel wrote:
wms wrote: When there's a bug, it's best if the bug shows up when I test so that I can fix it there.
But this bug happens so rarely, and I don't know how to make it happen, so I only get info about it from crashes...and very, very little info even then. All I've been able to pin down is that variables are getting utter nonsense in them. A boolean, for example, will have a number in it instead of a 0 or 1. I suspect that I'm walking past the end of an array in the network code, or something like that, but I can't figure out where that could be happening.
Ever heard of open source code...
Have fun debugging!
That wouldn't really change the nature of the bug, and wms would still be the only one with access to the server where the problem occurs. More eyes on the problem, yes, but that's it.
- CarlJung
- Lives in gote
- Posts: 429
- Joined: Wed Apr 21, 2010 1:10 pm
- Rank: SDK
- GD Posts: 0
- KGS: CarlJung
- Location: Sweden
- Has thanked: 101 times
- Been thanked: 73 times
Re: The KGS server keeps crashing
Helel wrote: Ahh, so the bug is in no way related to anything wms has coded. My bad.
It's quite possible that it is; that remains to be seen. But even so, open-sourcing the code wouldn't make it any easier to debug. The error occurs on the server, and you can't give everyone access to it to tinker away to their heart's content.
- CarlJung
- Lives in gote
- Posts: 429
- Joined: Wed Apr 21, 2010 1:10 pm
- Rank: SDK
- GD Posts: 0
- KGS: CarlJung
- Location: Sweden
- Has thanked: 101 times
- Been thanked: 73 times
Re: The KGS server keeps crashing
wms,
I'm sure you have a test server. Has the error ever occurred on that one? Can't we fill it with a few thousand weakbots/randombots that all play blitz in order to simulate some load? I have 10 Mbit upload that mostly sits idle and quite a powerful computer. I'm sure others have similar setups. That could potentially be a way forward.
- tj86430
- Gosei
- Posts: 1348
- Joined: Wed Apr 28, 2010 12:42 am
- Rank: FGA 7k GoR 1297
- GD Posts: 0
- Location: Finland
- Has thanked: 49 times
- Been thanked: 129 times
Re: The KGS server keeps crashing
CarlJung wrote:
Helel wrote: Ahh, so the bug is in no way related to anything wms has coded. My bad.
It's quite possible that it is; that remains to be seen. But even so, open-sourcing the code wouldn't make it any easier to debug. The error occurs on the server, and you can't give everyone access to it to tinker away to their heart's content.
From what I remember of my merry days of coding and debugging C/C++ (which is what I suspect the code in question is), this kind of bug isn't often caught by debugging at the point where the actual error occurs. The problematic code may have executed well in advance. If the culprit is something wms wrote, then it might help to have several people look at it. Of course, if the bug could be virtually anywhere, that probably won't help unless it can be narrowed down. One theoretical possibility is to run everything in a debugger with watches guarding the memory that will eventually be overwritten, but that is (or at least used to be) much too slow. Perhaps debugging tools have improved since I coded for a living.
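A cheaper stand-in for debugger watches, when the damage happens well before the crash, is to check a few invariants at labelled checkpoints and remember the last one that passed. This is a rough sketch under assumptions, not anything taken from the KGS server: `conn_state`, its fields, `conn_sane`, and `checkpoint` are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connection record; field names are illustrative only. */
typedef struct {
    unsigned char is_open;   /* intended to hold only 0 or 1 */
    size_t        buf_used;  /* must never exceed buf_cap */
    size_t        buf_cap;
} conn_state;

/* Label of the most recent checkpoint at which the state still looked valid. */
static const char *last_good_checkpoint = "none";

/* Returns 1 if every invariant holds, 0 if the record looks corrupted. */
static int conn_sane(const conn_state *c) {
    return c->is_open <= 1 && c->buf_used <= c->buf_cap;
}

/* Records the label and returns 1 if the state is sane; otherwise returns 0
 * and leaves last_good_checkpoint pointing at the previous good label. */
static int checkpoint(const conn_state *c, const char *label) {
    if (!conn_sane(c)) return 0;
    last_good_checkpoint = label;
    return 1;
}
```

When a checkpoint fails, the corruption happened somewhere between `last_good_checkpoint` and the failing label, which is a far smaller search area than "anywhere before the crash" and costs only a couple of comparisons per call.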
