HACKER Q&A
📣 Eugeleo

Why doesn't GitHub save just the AST, letting devs decide on formatting?


Big companies & repos have standardised code styles, a big part of which is formatter settings. Would it be possible to track only the AST in git, leaving the formatting of the code 100% up to the viewer?

That would allow different devs from the same company to view the code however they like; it would separate meaning from style. And I'm sure working internally with only the AST would bring some more benefits I just fail to see at the moment.

Is anybody working in this space? Or is it just too much work to be worth it?


  👤 gvb Accepted Answer ✓
GitHub is based on git and inherits its features, assumptions, and quirks.

1) Git's diff and merge tooling works on line-oriented differences between text files. This works well for any text file, regardless of what the text represents. If you switched to ASTs, you would have to create an AST-oriented diff algorithm.

1a) You would have to write a language parser to generate the AST for every language you wanted to support.

2) ASTs generally don't represent comments. You would have to make your parser understand and preserve comments.

2a) Oftentimes comments are formatted in a specific way, e.g. using spacing and line breaks, to aid understanding. If you reformat the code, you may destroy the information inherent in the comment formatting.

3) While there are "standardized" coding styles, there are innumerable corner cases that the styles don't cover explicitly or that programmers "interpret differently." Programmers tend to get unhappy if you reformat their code with ironclad justification and get really pissed if you do it without ironclad justification.
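
To make points 2 and 2a concrete, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). The language choice, the sample source, and the `same_ast` helper are assumptions made up for illustration; other parsers may behave differently. It shows a round trip through the AST dropping both the comments and the hand-aligned layout, while two differently formatted snippets compare equal at the AST level.

```python
import ast

# Hypothetical sample: a hand-aligned table with explanatory comments.
source = """\
# Lookup table, aligned by hand for readability
OFFSETS = [
    1,   0,   0,   # x axis
    0,   1,   0,   # y axis
]
"""

tree = ast.parse(source)
regenerated = ast.unparse(tree)  # source text rebuilt purely from the AST

print(regenerated)


def same_ast(a: str, b: str) -> bool:
    """Return True if two source strings parse to structurally equal ASTs.

    ast.dump() omits line/column attributes by default, so layout
    differences do not affect the comparison.
    """
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


# Same meaning, different style: the AST cannot tell these apart.
print(same_ast("x = (1 +\n     2)\n", "x = 1 + 2  # sum\n"))
```

The regenerated text contains neither comment and none of the original alignment, which is exactly the information an AST-only store would have to carry some other way.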

So, yes, IMHO it is just too much work to be worth it.


👤 richardjennings
what would a diff look like?
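
One way to picture it, as a rough sketch only: serialize each version's AST to text and diff that. The example below uses Python's standard `ast` and `difflib` modules (Python 3.9+); the helper names `ast_lines` and `ast_diff` and the sample snippets are hypothetical. Note that this is still a line diff over a dumped tree, not a true tree-diff algorithm, so it sidesteps rather than solves the problem raised in point 1 above.

```python
import ast
import difflib


def ast_lines(source: str) -> list[str]:
    """Parse source and return its AST as indented text, one node field per line."""
    return ast.dump(ast.parse(source), indent=2).splitlines()


def ast_diff(old: str, new: str) -> list[str]:
    """Unified diff between the dumped ASTs of two source strings."""
    return list(
        difflib.unified_diff(
            ast_lines(old), ast_lines(new), "old", "new", lineterm=""
        )
    )


original = "total = price * qty\n"
reformatted = "total = price*qty   # reformatted only\n"
changed = "total = price * qty + tax\n"

# Formatting-only change: the dumped ASTs are identical, so the diff is empty.
print(ast_diff(original, reformatted))

# Semantic change: the diff is expressed in AST nodes, not source lines.
for line in ast_diff(original, changed):
    print(line)
```

The second diff is noticeably more verbose than the one-token source change that produced it, which hints at why a readable AST diff needs more than a text diff over a dump.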