I suppose he could read an entire database of games into RAM. If we assume a rough upper bound on the number of moves of 361 (19*19), with each move represented in 2 bytes (a byte per coordinate) and then round up to the nearest kilobyte (because working with multiples of 1 is easier), that gives us 1kb per game. You could easily fit a few hundred thousand of such into RAM these days .
nagano wrote:Let's just say that I have an image in my mind of a go database program which is far more extensive and sophisticated than any that currently exists. If I manage to implement even half of my ideas, it will be revolutionary. I know it will probably take me a long time, and eventually I'll need to bring others on board. I don't want to give specific features away at this point since I don't want my ideas stolen, as the program could potentially become a commercial venture.
Does that change anyone's answer?
19/02/2011: this grumpy person takes a voluntary holiday from L19.
nagano wrote:Let's just say that I have an image in my mind of a go database program which is far more extensive and sophisticated than any that currently exists. If I manage to implement even half of my ideas, it will be revolutionary. I know it will probably take me a long time, and eventually I'll need to bring others on board. I don't want to give specific features away at this point since I don't want my ideas stolen, as the program could potentially become a commercial venture.
Does that change anyone's answer?
What do you mean?
"Those who calculate greatly will win; those who calculate only a little will lose, but what of those who don't make any calculations at all!? This is why everything must be calculated, in order to foresee victory and defeat."-The Art of War
I was just wondering what assumptions people were making when suggesting languages and whether a little more context might change their suggestions, though mostly the advice has been somewhat general, so I think it probably won't change.
19/02/2011: this grumpy person takes a voluntary holiday from L19.
"Furthermore, a higher level language (which according to some benchmarks may be slower) may make it easier for you to select the right data structures and algorithms, reducing the algorithmic complexity of your end result and actually giving you a program that solves the problems you are interested in faster"
See -- I am disagreeing totally with this. The selection of data structures and algotrithms should be taking place in the design phase where everything is still at an abstract level and haven't decided things like "what langauge will this be implemented in" (of course that might be a "shop standards" issue, already predetermined which langauges allowed for production programs in the shop --- maintenance of programs usually a more important consideration than a slight improvement in the efficiency of any one program so sensible to restrict the number of langauges a shop has to have experience on hand to support).
If you instead start with the langauge and what data structures and algrotihms come built in then you are very likely to make data structure and algorithm decisions based on that and can almost guarantee bad choices.
fwiffo wrote:IMO, prototyping in a high level language is the design phase.
I can see some advantages to this in the later portion of the design phase, but sitting down with pen and paper (usually using actual pen and paper) at the beginning of the design phase and putting down the fundamentals has always worked better for me than blind (or semi-blind) prototyping as you describe above. Jumping in and hammering out a quick code prototype has its usefulness, though, I do admit.
fwiffo wrote:IMO, prototyping in a high level language is the design phase.
Care to try an example? After retiring from a career in large financial systems I decided to treat myself to learning C. For a "case problem" I chose writing a compress/decompress pair for a finite version of the Lempel-Zev 2nd Universal Data Compression algorithm. Definition:
Given an initial dictionary, transmit the number of bits that are the ID of the longest match in the dictionary followed by the character caused the mismatch and now add that string (the one that matched plus the character) to the dictionary. In the finite version, once the dictionary has reached some maximum size, remove the least recently used dictionary element in order to add the new one.
Sorry but I can't show you my solution as had a house fire in 2006 and backups did not survive the heat (now kept in a fire safe with a copy in a different building). But this wasn't a particularly large program. Part of this learning exercise was "standards" so had all the features like "about", "help" etc. and was heavly commented and I don't cram stuff on a line and still the compress/decompress program (could do either) was well under 2000 lines.
What I'm trying to say here is that how could you prototype in a high level langauge first? Without first visualizing what sort of structures you will want for "dictionary", "dictionary element", "least recently used", etc.
Well, C is not exactly a high-level language. I'm thinking something like Python. And my statement was hyperbole (depending on the size of the project). But prototyping parts of the problem is a large part of the process.
So I might start by just picking one of the more interesting functional units, like the dictionary search and just get it working with whatever data structure was convenient and a completely naive algorithm like a linear search. Just to get it working - maybe it'll be stupid and slow and only work on a data set, but I just want to have some logic there to look at. Just doing the simplest thing that could possibly work.
Then I'd decide if the simplest approach is "good enough". Sometimes it is. Sometimes it needs some trivially improvement to be "good enough", like a binary search, or a hash table.
I'd repeat the process for other parts of the program, keeping in mind what I'd learned from other parts I'd worked on, and make refinements along the way.
For me, trying things out is the best way to get a feel for the problem. I'll have a much better idea of what sort of data structures I'll need, what the pitfalls are, what the bottlenecks are, etc. once I've played with some actual code. And in a language like Python, you're almost writing pseudocode, so it's not much different than hashing out ideas on paper, except you can actually test it out.
Granted, my day-to-day work is database backed web sites, which tend to be completely uninteresting from a CS/algorithms point of view.
Gee, and I thought the advantage of "high level" languages was supposed to be that they directly supported high level structures.
"So I might start by just picking one of the more interesting functional units, like the dictionary search and just get it working with whatever data structure was convenient"
Search what? (what is the structure you will search)
Here's a hint -- the dictionary structure that would be "convenient" for this problem is the "n-tree" and as the stream in input characters come in you don't "search" it but traverse it to the farthest node you can reach (that's the longest match). The character coming in that didn't match would be added as a child (added to the child list of this node). Don't be misled by thinking that conceptual string matching means any need to store strings as a data type.
For the nodes you might try a structure that had an ID (an integer) and a value (the character) and a number of links. Going to have to be able to delete as well as add nodes and be able to traverse backwards from node to root and to keep track of least recently visited. So links to left and right sibling, link back to parent (for decompressing), link down to child list, link to next more recently and next less recently visited. While the dictionary is below mamimum size these can be kept on an "available" list assigning one of the links for that purpose.
Look, I chose this example precisely because while parts of the structures handy would be available as built in objects in a higher level language like C++ others would likely not (does it have an "n-tree" primative? -- the more common "b-tree" perhaps). And of course because making heavy use of linked lists (mainly double because intensive moving nodes around and the need to traverse both ways).
Diving right in to code is a bad way to do any program more difficult than you can visualize in its entirety in your head (for me that might be substantial). If you think you might even have to "hunt and peck to see what works" you've bitten off more than you can chew that way. Design top down and design first.
We're talking past each-other. Mucking around in code to get ideas doesn't preclude top down design, it informs it. Rapid prototyping in software is quite common these days.
I did not read the whole thread, just the last few posts.
Mike, you seem like a competent coder. I'm going to guess you're professional coding experience was from, say, late 80s to early 90s. To be frank, though, much has changed in what is considered "best practices". Most of your experience as far as project planning would probably be considered out-dated in the modern workplace (with the exception of work on embedded systems probably). Out is the idea that you should design everything upfront, and in is the idea that you should get something roughly working NOW and iterate on it. The idea being that clients are stupid, and even when they tell you what they want they have no idea what they really want. Even (and especially) when the "client" is you, the coder.
So I'm going to side with fwiffo here. Get your SQL database of go games or whatever, and your language (python is as good a choice as any, though I'd personally prefer C# because of the IDE), and just hack away with really naive code to get something that more or less works. Be as naive in the implementation as you need to be to get it working fast (fast meaning how much effort you put in. Not in how fast it runs). Then use it (play against it), and refine.
My personal opinion on business practices (past and present) is that different projects benefit from different design processes, and a good number of modern code houses (from small operations to enterprise) have become smart enough to tailor their design processes to each individual product. (And, of course, there are a good number of code houses that cling to a single design paradigm, with mixed results.)
My personal design preferences: design first. Forgetting about work processes, I prefer to design my personal projects before writing them out. I get things done quicker that way.
This doesn't mean I won't design iteratively and prototype early when the need arises. It simply means I like to start with a set of non-trivial requirements fleshed out before I start writing code. Makes my personal coding experience much smoother and easier to handle, given that I spend less time in front of a computer screen ... I don't like spending long periods of time writing code.
For the record, I currently do code as part of my position at work. I'm happy because it's not my entire role, and I can switch gears frequently throughout the day to keep my brain going.
If I might argue against myself for a second, it actually really depends on what you're doing. Mike's example is a very precise specification, which makes a very planned design a lot more possible.
But here's a fictitious but typical specification that I see regularly in the sort of work I do (web stuff).
----
The client wants a web site. They don't know what they want it to look like, but they have a couple competitor sites they want to look better than, and they have some other web sites that they like the look of. His daughter is an art student and thinks the color blue is really great.
The site needs to be easy for them to update, and it also needs to have a way for their clients to search their inventory. They have a separate inventory system which our code will need to import from. They can't tell us anything about the file format it exports, but they can give us an example file. They're not sure yet how we can automate the export process. They also don't know what the search form on the web site will look like, or what fields it will be searching.
They also have some flyers, manuals, and marketing literature that can be cribbed from for the content of the web site.
----
That specification is obviously terrible. But that's what I have to work with. And I also know it's going to change every time we talk to the customer.
The first thing I need to do is roll out an instance of our (constantly changing) content manager. We need to have that in place immediately so that our content people can start putting in web site content, even though we don't have a design or any of the custom features done, otherwise it would take too long.
Next I need to get started to work on the custom code for the site (the inventory import and search) even though the specification is maybe 10% complete. I can't wait until the graphic design or content is done, or until we know more about how their inventory export works, otherwise it will take too long to get the site out.
The best way to get started on that is to take a look at that sample file and see if I can figure out what's in it. I may or may not be able to find an exact specification for the file format - they have some weird in-house system which might not even have a specification. But it's almost always CSV. I just need to figure out which flavor, what character encoding, and what the different fields are, and how they're related. The best way to do that is to write some quick-and-dirty code to parse the sample file and print it out in a human readable format.
Once I've figured out the file format, then I can design a database schema that can represent the data and write a script that can import that data in an automated way. I still don't know if I'm going to be getting it from an FTP server, or if it's going to be manually uploaded somehow, or from a SOAP call or what.
Once I've got some sample data in the database, I can write the search. I pretty much have to guess how they want the search to work because the client isn't ever going to be able to articulate it. But I can make a pretty good guess because I've learned about how their data works from reverse engineering their file format. Once I have a prototype search coded, we can show it to the client and see what direction it needs to go from there.
----
That's the sort of situation where I think this sort of prototyping and playing with code is most helpful. The specification is incomplete constantly changing, and the project is not terribly large. If you have the luxury of a detailed specification, you can do a lot more up-front design.
Well, even with so-called lightweight methods, you still need to have some idea of what you'll be doing beforehand. I suspect that if you can just jump in and be successful with apparently little thought, then you're simply underestimating what thought you're actually putting into it or the results of prior experience allowing you to rapidly choose a roughly appropriate solution. Just as in go where a strong player might be able to play flexibly without planning out his whole game because he's built up the unconscious mental machinery to evaluate board positions quickly and identify what moves may reasonably lead to workable strategies.
19/02/2011: this grumpy person takes a voluntary holiday from L19.