Thanks for your interesting suggestion :)
YeGo wrote:
What I meant is that dealing with very long comments without line breaks or ones that inherently contain new lines poses some additional challenges and design considerations for indenting SGF. It's quite common to see paragraphs of text within comments for annotated games. In the first case, most text editors/viewers would wrap long lines, which visually disrupts your indentation scheme. In the second case, one can't remove the hard line breaks, which are presumably desired by the user. Also, some text editors/viewers struggle with very long lines of text that are not naturally broken up with line breaks. I was suggesting that soft line breaks could be one way to address this last issue, but that also raises some further design questions.
That's true, though personally I can live with the problem. I don't think the following SGF looks too ugly:
Code:
<- editor width ->
(
;FF[4]
C[line 1
line 2
looooooooooooooong
line wrapped by
editor]
B[pd]
;W[qp]
)
How do you feel about the above case? We may be able to learn something from
JSON.stringify, which also has to handle long strings.
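For comparison, here is a small sketch of what JSON.stringify does with a long multi-line string when indentation is requested. The node object is made up for illustration. The string is never wrapped, and hard line breaks are escaped as \n, so the indentation stays intact at the cost of one very long line. SGF has no \n escape; as far as I know the closest thing in FF[4] is the soft line break (a backslash followed by a line break).

```javascript
// A made-up object standing in for the node ";FF[4]C[...]B[pd]".
const node = {
  FF: "4",
  C: "line 1\nline 2\nlooooooooooooooong line wrapped by editor",
  B: "pd"
};

// The third argument asks for two-space indentation.
const text = JSON.stringify(node, null, 2);

// Each property ends up on exactly one line, however long the comment is.
console.log(text);
```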
YeGo wrote:
If all SGF files were simply a linear record with zero variations, then of course it would be most natural to represent everything as a list of nodes. However, as soon as you allow for variations, you have to deal with an inherent tree structure. Then you are forced to create a hybrid tree of lists of nodes, which is more complex to traverse than a simple tree of nodes.
The structure of a game record is inherently a tree and can be easily represented as just a tree of nodes. Even a game record that is just one linear sequence with no variations is still a tree (just with each node having exactly one child, except for the last node which has no children). SGF is just representing a tree of nodes in pre-order while dropping unnecessary nesting parentheses.
I think the original Perl module, and other projects inspired by it, have made a design error in misunderstanding the grammar of SGF to imply a more complex data structure than intended, which is a simple tree of nodes (see http://www.red-bean.com/sgf/var.htm). The lack of unnecessary parentheses (obviated by the use of semi-colons prefixing each node) perhaps is the source of this confusion.
I prefer practicality to universality in this case, since we're not handling a generic tree
structure but a go game record played by human beings. Though I'm not sure how computers think
about the next move, I think a sequence is the unit of the game record, not a single move.
Joseki is a good example. It's a set of sequences. I believe we can rebuild SGF by treating
a sequence as the node of the game tree. At the risk of being misunderstood, I think the Perl
module adopted a sequence-oriented data structure. That's why I like it. Note that the Perl
data structure can always be converted into the data structure that you proposed, and vice versa.
YeGo wrote:
Consider these example grammatical structures for trees represented in pre-order:
Code:
(root(a(b(c)(d(e))))(f(g(h(i)))(j)))
(;root(;a(;b(;c)(;d;e)))(;f(;g;h;i)(;j)))
Both represent the same tree structure, but the first uses nested parentheses, whereas the second simply has removed the parentheses made unnecessary by the semi-colon prefixes. In this example, since there is a high branching factor, the second format (which is the one used by SGF) turns out to be slightly less efficient, but it has clear benefits in clarity (in removing some unnecessary parentheses) and efficiency for trees with lower branching factors.
That's totally true. Thanks to your clear explanation, I can see why SGF was designed this way.
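As a rough sketch of that equivalence, the toy parser below reads the semicolon form and prints the tree back in the fully parenthesized form. It only handles bare labels like the ones in the example, not real SGF properties, and the names parseTree and toNested are made up here.

```javascript
// Toy parser for strings such as "(;root(;a)(;b;c))".
// Node labels stand in for real SGF properties, which are not handled.
function parseTree(text) {
  let pos = 0;

  // GameTree = "(" Node { Node } { GameTree } ")"
  function gameTree() {
    expect("(");
    const first = node();
    let last = first;
    // Each further ";" starts the only child of the previous node.
    while (text[pos] === ";") {
      const next = node();
      last.children.push(next);
      last = next;
    }
    // Each nested "(" starts one variation below the last node.
    while (text[pos] === "(") {
      last.children.push(gameTree());
    }
    expect(")");
    return first;
  }

  // Node = ";" label, where the label runs up to the next delimiter.
  function node() {
    expect(";");
    const start = pos;
    while (pos < text.length && !"();".includes(text[pos])) pos++;
    return { name: text.slice(start, pos), children: [] };
  }

  function expect(ch) {
    if (text[pos] !== ch) {
      throw new SyntaxError(`expected "${ch}" at position ${pos}`);
    }
    pos++;
  }

  const root = gameTree();
  if (pos !== text.length) throw new SyntaxError("trailing input");
  return root;
}

// Print a tree of nodes with one pair of parentheses per node.
function toNested(n) {
  return "(" + n.name + n.children.map(toNested).join("") + ")";
}

const nested = toNested(parseTree("(;root(;a(;b(;c)(;d;e)))(;f(;g;h;i)(;j)))"));
console.log(nested);
```

Feeding it the second example string gives back the first one, which is a quick way to check that both really describe the same tree.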
YeGo wrote:
The Eidogo code gives an example of how to parse into a basic tree structure:
https://github.com/jkk/eidogo/blob/mast ... /js/sgf.js
Also, here are some other related open-source projects:
https://github.com/Kashomon/glift
https://github.com/IlyaKirillov/GoProject
Yeah, I've already read (part of) the code. They are great projects :)
My parser is something like this:
https://github.com/anazawa/sgf.js/blob/master/sgf.js#L951
YeGo wrote:
Why shouldn't the users (presumably those using this code as a library to parse SGF files), directly have access to the data structure? If a further layer, provided by a visitor/iterator, is needed to encapsulate the structure, then the output is inherently this encapsulation. Does this encapsulation basically provide a tree of nodes type of abstraction? If that's the case, why not just make the underlying data structure a tree of nodes and give that directly to the user?
Because it's like touching
Rack's response array directly instead of going through middleware, or through Ruby on Rails,
which is a wrapper around Rack. Furthermore, SGF property names such as PB or PW are far from user-friendly.
I'm talking about the user experience. If the user were an SGF expert like you, he/she would think
it's unnecessary to encapsulate the data structure. However, most users, including me, are not.
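Here is a minimal sketch of the kind of encapsulation I mean. The GameInfo class and its getter names are hypothetical, written only for this post; they are not the API of sgf.js or of any other library. The raw properties are still reachable through props for anyone who wants them.

```javascript
// Hypothetical wrapper around the raw properties of a root node.
class GameInfo {
  constructor(rootProps) {
    // Raw SGF properties, e.g. { PB: "...", PW: "...", KM: "6.5" }.
    this.props = rootProps;
  }

  // PB holds the name of the black player.
  get blackPlayer() {
    return this.props.PB;
  }

  // PW holds the name of the white player.
  get whitePlayer() {
    return this.props.PW;
  }

  // KM holds the komi as text; convert it to a number when present.
  get komi() {
    return this.props.KM === undefined ? undefined : Number(this.props.KM);
  }
}

// Made-up example data.
const info = new GameInfo({ FF: "4", PB: "Alice", PW: "Bob", KM: "6.5" });
console.log(`${info.blackPlayer} (B) vs ${info.whitePlayer} (W), komi ${info.komi}`);
```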