Good code is different depending on your sense of aesthetics, but also on its purpose.
Good code for an enterprise-life-blood type of system is terrible code for a "let's check if this is an idea" type of prototype (and vice versa).
For a relatively large and mature project, someone might suggest Linux, but it's a bit hardcore in my opinion. FFmpeg is actually really nice, and you can wrap your head around the general idea of how that system works, to the point where you can comfortably add new options and even introduce new codecs and containers in a few days.
https://cs.opensource.google/go/go/+/refs/tags/go1.21.0:src/...
https://cs.opensource.google/go/go/+/refs/tags/go1.21.0:src/...
(Tooting my own horn) A Fitbit watchface that I wrote a few years back: https://github.com/GWBasic/Binaryish-Clock
An event/threading library for C#. I keep a fork in my Github because the original source was archived: https://github.com/GWBasic/retlang
Note that both examples are "functionally obsolete." The Fitbit studio environment is deprecated in favor of Android Watch; and if you're using C#, you should be using Tasks to get similar functionality to Retlang.
Good code is working code, code that pays the bills. Focus instead on writing code you can throw away easily, code that you are wholly unattached to and is isolated enough that rewriting it won’t cost absurd hours.
https://github.com/Svxy/The-Simpsons-Hit-and-Run/tree/eb4b34...
s/read/train my AI on/
If you're looking for programming inspiration, I love these videos:
Jon Bentley - Three Beautiful Quicksorts https://www.youtube.com/watch?v=aMnn0Jq0J-E
Bret Victor - Inventing on Principle https://www.youtube.com/watch?v=PUv66718DII
The first is Jones Forth (https://github.com/nornagon/jonesforth), start with jonesforth.S and move into jonesforth.f. I really enjoyed following along with it and trying my hand at making my own stack based language.
The other is Xv6, a teaching operating system from MIT (https://pdos.csail.mit.edu/6.828/2021/xv6.html), not all the code or implementations are top notch but it shows you non-optimized versions (just because they're simple and more readable) of different concepts used in OS design.
If you're interested in the embedded world, there is a really neat project I've been following that feels more structured and safe (as in fault-tolerant) while still staying pretty simple (both conceptually and in the code itself): Hubris and Humility (https://hubris.oxide.computer/).
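For a taste of the stack-based style that Jones Forth teaches, here is a minimal sketch of a Forth-like evaluator. It is not taken from jonesforth; the `run` function and the handful of words it supports are my own illustrative choices, and a real Forth also has a dictionary, a return stack, and compile mode.

```python
def run(source, stack=None):
    """Evaluate a whitespace-separated program of integers and words."""
    stack = [] if stack is None else stack

    # Each word is a function that mutates the data stack in place.
    # This tiny word set is an assumption chosen for illustration.
    words = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
        "swap": lambda s: s.extend([s.pop(), s.pop()]),
        "drop": lambda s: s.pop(),
    }

    for token in source.split():
        if token in words:
            words[token](stack)
        else:
            # Anything that is not a known word is treated as an integer literal.
            stack.append(int(token))
    return stack


print(run("2 3 + dup *"))  # (2 + 3) squared
```

Adding a word is just adding a dictionary entry, which is a large part of why writing your own stack language makes such a good exercise.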
I’m sure I’ve seen some HN posts about this article, but I can’t remember the content of such.
Any code that is used and minimally hated by a large number of people is good code in my opinion. It's valued by its users, and ultimately that is what matters.
It may not be perfectly DRY, use popular abstractions, or whatever people today think of as ideal code, but successful software projects solve real problems every day. The authors have likely done a good job of balancing usability with code quality. And I think that's the best we should hope for.
And then read any code through that lens. Then read some different code and contrast it. What did you like more or less? What worked and what didn't? Where did it work and where didn't it?
Remember that code that is fantastic in some ways is often horrible in others (e.g. the legendary fast inverse square root).
Approaching it this way helps one consider the reasoning behind what makes certain code good, and forces one to examine the context of the code, which is also critical. And being opinionated helps you remember to apply those rules in the future.
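To make the fast inverse square root example concrete, here is a rough Python port of the trick. The well-known version is C code associated with Quake III Arena; this sketch only mirrors the idea using `struct` to reinterpret bits, and the function name is my own. It shows both sides of the trade-off: a clever approximation, and code that is opaque without comments.

```python
import struct


def fast_inverse_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the bit-level trick."""
    # Reinterpret the 32-bit float's bit pattern as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern yields a first guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a 32-bit float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - 0.5 * x * y * y)


approx = fast_inverse_sqrt(4.0)
print(approx)  # close to 0.5, but not exact
```

Note that in Python this is slower than `x ** -0.5`, which is a nice illustration of how much the "goodness" of this code depended on its original context.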
- Ratio of code:documentation INSIDE the source code
- Directory structure depth is “just right”; not too deep nor too shallow
- Number of dependencies is “just right”; don’t build things yourself, but also don’t import the whole world
- TTLD (Time To Local Dev); how simple is the getting started guide in terms of copy-pasta commands + automation + the right amount of context + easy-to-use tooling
- Code culture; follow industry best practices and make it clear where & why you deviate
My personal favorite one: `make todo_list`
We use keywords (TODO, OPTIMIZE, HACK, etc…) through the codebase and make them easily searchable with make helpers.
Ref: https://github.com/pokt-network/pocket/blob/main/Makefile#L5...
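The keyword search idea is easy to try outside of make as well. Below is a minimal Python sketch of it; this is not the recipe from the linked Makefile, and the keyword list, the file suffixes, and the function names `find_markers` and `scan_tree` are assumptions for illustration.

```python
import re
from pathlib import Path

# Assumed subset of keywords; extend to match your own conventions.
KEYWORDS = ("TODO", "OPTIMIZE", "HACK")
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b")


def find_markers(text):
    """Return (line_number, keyword, stripped_line) for each marked line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), line.strip()))
    return hits


def scan_tree(root, suffixes=(".go", ".py")):
    """Print every marked line under root, grep-style."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(errors="ignore")
            for number, keyword, line in find_markers(text):
                print(f"{path}:{number}: [{keyword}] {line}")
```

Calling `scan_tree(".")` from the repository root prints one line per marker, which is roughly what you would want a `todo_list` style helper to produce.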
Note that "WEB" here is not "web" as in website.
Actually most of the big Square OSS libraries are great to read - okio, okhttp, picasso.
(Unfortunately many people are chasing the Holy Grail of "functional programming" and never finding it because "functional programming" is a pale shadow of what's possible when you understand how compilers work: this is how Common LISP and Scheme are so much more profound than, say, Haskell)
It's a little out of date, but lately I was thinking about the Scott Adams adventure games of the early 1980s, which were written with a specialized interpreter that could be implemented in BASIC but was also implemented in assembly language for better performance. See
https://ahopeful.wordpress.com/2020/09/13/digging-up-adventu...
If you tried to implement a game like that directly in a language like BASIC you would be driven nuts because that kind of game is fundamentally "object oriented" in that there are a number of things like rooms and items that are all mostly the same except they are different in some ways and trying to code that with IF, THEN and ELSE is bad enough even before GOTO gets added to the mix. The thing is that GOTO becomes quite benign and even useful when it is used to implement interpreters.
So back in the day you would study systems like that to stretch your skills, today I would look at compilers and related technology, like the Jena rules engine.
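Here is a small sketch of why the interpreter approach beats hand-coded IF/THEN chains for that kind of game. The rooms and items live in data tables and one generic `step` function interprets commands against them. This is not the Scott Adams data format; the tables, the two verbs, and all names are made up for illustration.

```python
# World data: adding a room or an item means adding a row, not new branches.
ROOMS = {
    "forest": {"desc": "You are in a forest.", "exits": {"north": "cave"}},
    "cave": {"desc": "You are in a dark cave.", "exits": {"south": "forest"}},
}
ITEMS = {"lamp": "forest"}  # item name -> room name, or "inventory"


def new_game():
    """Create a fresh game state with its own copy of the item table."""
    return {"room": "forest", "items": dict(ITEMS)}


def step(state, command):
    """Apply one 'verb noun' command and return the text to show."""
    verb, _, noun = command.partition(" ")
    room = ROOMS[state["room"]]
    if verb == "go" and noun in room["exits"]:
        state["room"] = room["exits"][noun]
        return ROOMS[state["room"]]["desc"]
    if verb == "take" and state["items"].get(noun) == state["room"]:
        state["items"][noun] = "inventory"
        return f"Taken: {noun}."
    return "You can't do that."


game = new_game()
print(step(game, "take lamp"))
print(step(game, "go north"))
```

The logic in `step` never mentions a specific room or item, which is the property that made those interpreters portable between BASIC and assembly.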
Specifically something polished that also has complex UI elements like block indicators or embedded modals.
I feel like I get a lot more out of messing with / hacking on code than I do from reading it. I'm sure people vary, but I've got loads more out of open source contributions to sometimes small projects, and not very much out of trying to do something like read the code for the Glasgow Haskell Compiler or something.
So for code to read, I think having an in is crucial, at least for me. So I'd say, find a cool, maybe small open source project, look at the issue tracker if it has one, and try to implement something. You'll only really know if it's good code (or why) after you start trying to change it.
Someone can tell you "This is good code." but good for what? Why is it good?
Is it fast code? Is it highly maintainable? Is it well documented and kept up to date? Is it code that is highly reliable? Is it code that solves an important problem?
My rule of thumb is: Ugly code usually comes from ugly problems. Ugly code can often be some of the most valuable code, because... it does the ugly things! It does what we want 99.9% of the time, using heuristics, and other nasty stuff.
So don't judge code on if it is "good" or not. Judge it on if it does what the author intended, and if it doesn't suck too badly to read with no reason.
Code bases I've worked in and have opinions on:
Samba: Good code base, but you MUST understand the idioms of the codebase, or it is absolutely horrible. It also, alas, has the wisdom of 20+ years of existence in it... so it isn't always pretty.
Illumos/OpenSolaris: Nice codebase. Get the SmartOS distribution and you can literally type a few commands and build an entire OS and userland.
FreeBSD: See above. Great codebase, and also, it can build userland + kernel, though it takes a few more commands. I'll admit I haven't read this one in 20 years. But I always found it a good codebase to work in back when :).
Grab the source for a library you use all the time, you know the useful one but the API feels a bit off... Download it and look at why the API is the way it is.
When looking at code, do NOT neglect looking at the history of a given file or piece of code, it often can teach you quite a bit. :)
The runtime/std lib of the PL of your choice. Your text editor or a plugin that you use etc.
Instead: write code for whatever fits your fancy, and request experts to give you feedback.
I would look for places the code seems to be really liked by its users. Maybe it's very reliable, or extensible, or fast, or something else. How do they achieve that? Why do the users say these things? How do they measure / focus / make tradeoffs to focus on those attributes?
Then for painful to use software & common painpoints, why does this happen? Is it a fundamental design decision? Is it just sloppy code? Is it just intentionally slow to be more user-friendly? Or hard to read code because the focus is on speed?
It's all about the tradeoffs and intentional choices...
Lots of source these days has auto-documentation comments. Good IDEs present that documentation, which helps you guess which calls are worth following down the Step Into rabbit hole.
Often, from a high-quality framework / library I learn a bunch about handling weird edge cases and about writing code for long-term maintainability. And, I often learn some useful constructs and techniques. (And, it's possible to learn useful things from not-so-high quality code too.)
The difference in a well structured codebase is that some of the code prevents you from having to read huge amounts of other code. All code is bad, it starts out bad just by existing, its only redeeming quality is preventing you from having to deal with more bad code.
Everyone thinks they write good "clean" code, and it's never true. Good programmers are good because of the architecture of their code, not because a single excerpt of code in isolation looks a certain way.
What you really want to read about are good designs. Read APIs, models, concepts, schemas, etc.
Another comment mentioned the Go standard library, and I totally agree. But stop at the APIs, if you look inside, you'll see that it's also mostly garbage. It's good because the APIs are good, and you don't have to read the rest.
I think reading code itself is only valuable when you need to explore a specific domain. Trying to extract coding patterns from an unfamiliar domain is very difficult; it comes with unseeable assumptions.
https://github.com/grantjenks/python-sortedcontainers/blob/m...
You would hopefully find that the most compelling aspect of well-designed software systems is the data. In "data-driven" applications, 100% of the application state and configuration can be made to live in a database somewhere. In these scenarios, seeking code examples is not going to tell you much of anything.
My advice is to look at a bunch of SQL schemas (ideally, ones you know to be under successful products) and compare them to the problem area they support. Think about how you would answer questions a reasonable person might ask of that business by way of a query. Then, consider how much code you just now did not write to answer a realistic business problem.
Relational modeling can eliminate entire repositories worth of bullshit code that should have never existed in the first place. Do you want to train yourself to rely upon something that a true wizard can walk in and disapparate at the snap of his fingers?
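As a small illustration of the "code you did not write" point, here is a sketch using Python's built-in `sqlite3` module. The two-table schema, the sample rows, and the function name are invented for the example and are not from any real product; the business question is "who are our biggest customers by revenue?".

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical schema: the relationships carry most of the meaning.
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total_cents INTEGER NOT NULL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (1, 1, 1500), (2, 1, 2500), (3, 2, 1000);
""")


def revenue_by_customer(connection):
    """Answer the business question with one query instead of loops."""
    return connection.execute("""
        SELECT c.name, SUM(o.total_cents) AS revenue
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY revenue DESC
    """).fetchall()


print(revenue_by_customer(conn))  # [('Ada', 4000), ('Grace', 1000)]
```

The grouping, summing, and sorting would each be hand-written code in an application that kept this state in ad hoc objects; here they fall out of the schema and a single query.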
In general, have a look at the Standard Template Library or Boost examples directory. Then some unit tests for simple GNU programs similar to what you are building, CLI command source like "ps" for OS interactions, and finally an OS kernel like Linux or *BSD. There are also several online classes offered by linuxfoundation.org etc.
Start with a small SBC like a pi4/BeagleBoard, and learn how to snapshot disk images (you will severely damage things while learning). There are also several open syntax formatting standards published by projects (and companies like Google), that will guide you on the local ecosystem.
Expect a Hazing in some places, as some folks tend to forget they were students once too.
It would also be wise to spend a few days studying security-auditing-tools, as one may learn to mitigate common ways people will try to break stuff. Detection and incident-handling is arguably more important than outright prevention.
Happy coding, =)
* Protocol implementations(e.g. TCP, HTTP, MIDI)
* Smaller compilers(PUC-Rio Lua, Forth-80)
* Commercial video game sources [0]
When studying protocols, you can compare apples-to-apples because the protocol has to work the same way by design, but the implementation can vary. With compilers, you're getting a look into programming in its maximally symbolic form - and every strategy a compiler uses is one you can directly apply to abstract your own code. And commercial video games have another mode of apples-to-apples in that the original release - the dirty, meets-deadline stuff - often can be compared with fan remakes and patches, which have the luxury of an exact specification and no deadlines. To actually ship in industry, you have to accept and know good dirty code hacks, but it's worth comparing them to their counterparts.
[0] https://en.m.wikipedia.org/wiki/List_of_commercial_video_gam...
1. Busybox. It's basically a collection of common Unix utilities, from common commands like ls or cat, to system daemons like crond and init. It's a good way to learn more about how Linux and other Unix-based systems work under the hood. Busybox applets are pretty independent from each other, so you can just take one and focus on it specifically, digging into the common library code if you need to. The utilities aren't as fully-featured as their GNU coreutils counterparts, which makes it easier to understand what they actually do. Busybox is not a toy project by any means, though, it's used in many embedded devices and leaner Linux distributions, Alpine being the prime example. However, it's written in some pretty dense C, with a fair bit of pointer magic involved, so if you don't understand things like the equivalence between a pointer and the beginning of an array, some things might not make sense.
2. The Go standard library. Unlike many programming languages, Go does not rely on much external code. Whereas Python delegates zip handling to zlib, handling of TLS connections to Openssl and so on, Go just includes all of this in the standard library, and it's all pretty readable Go code. If you want to understand many common algorithms or file formats, everything from sorting arrays to parsing JSON to sending and receiving HTTP requests or common cryptographic operations, all written in a readable style, in a language much higher level than C, just look at the Go stdlib. Go even fully implements everything needed for its own compilation, including linkers and assemblers. I haven't read these parts much, and a lot of that code is transpiled from C, so I can't say how good the code quality is.
3. Serenity OS. It's a hobby POSIX-based operating system written in C++, with no external dependencies, not even libc or libstdc++. They have their own homegrown implementations of every part of an operating system, from a monolithic kernel, to common Unix utilities, archive handling, audio and video codecs, common data structures, like vectors (growable arrays), hash maps, locks, mutexes and other concurrency primitives, a custom string implementation, a window server and a GUI library, including a fully-featured event loop system, their own window manager and many common GUI widgets, to actual applications and games. They even have a custom web browser with a custom web engine and JS interpreter. As a rule of thumb, if something is either in Busybox or in the Go standard library, there's a good chance it will also be in Serenity. Again, their utilities do much less than their non-serenity counterparts and are far less optimized, but that also means there are far fewer layers of abstraction to deal with and that the general principles underlying their implementation are actually easier to understand. The fact that everything is in a single repo, neatly organized, written in one language with common conventions, just makes it really pleasant to read. Their code quality isn't always the best, but the fact that it's C++ and not C does make things easier. Even though I haven't actually used the OS (because of accessibility concerns), it's one of these repos that I always have cloned on my computer, and it's the first place I look if I'm curious how a particular feature or app can be implemented.
4. If you're in any way interested in AI, everything written by Andrej Karpathy, notably Micrograd and Nano GPT. There's also Tinygrad, a bigger but still understandable take on Micrograd. Unless you're an expert, you need to watch Andrej's Youtube videos to actually understand the code, but the feeling I got when I actually understood the principle behind Micrograd is one I will never forget. I consider it to be the most beautiful piece of code I've ever seen, it basically embodies the whole principle of what a neural net is in 200 lines of code. Everything else that the big libraries do is basically just implementations of actual models, optimization and glue code, such as for loading data and such. It's often crucial optimization, optimization without which modern neural networks wouldn't be possible at all, but just optimization nonetheless.
5. Everything concerning Elixir, both the standard library, other libraries written in it, as well as open-source Phoenix web apps. Deep down, it's basically a Lisp without the off-putting parentheses. There are a lot of lessons to be learnt there, from the power of macros and the fact that things like "if" can be written in the language itself instead of being a special construct, to the power of pattern matching and the pipeline operator, to the advantages of its concurrency model and functional programming in general.
To generalize beyond these specific examples, if you want to understand something, find a smaller version of it, and try understanding that. The smaller version might just be a git commit from a good few years ago, with much fewer features, but it's better if it is a different, more basic (but preferably not toy) implementation of the same app, feature or algorithm. Don't read V8, PyTorch or Postgres, read Lua, Tinygrad or Sqlite instead.
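To give a flavor of the Micrograd idea from item 4, here is a heavily reduced sketch of scalar reverse-mode autodiff. This is not Karpathy's code; it is my own minimal version, supporting only addition and multiplication, with the class and attribute names chosen for this example.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = backward
        return out

    def backward(self):
        """Run the chain rule from this node back to every input."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)  # topological order: inputs first, output last
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


a, b = Value(2.0), Value(3.0)
c = a * b + a
c.backward()
print(c.data, a.grad, b.grad)  # 8.0 4.0 2.0
```

Everything a neural net library needs on top of this (more operations, tensors, optimizers) is the "smaller version first" lesson in practice: the core fits on one screen.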
Eventually, after working on, say, half a dozen code bases, you'll start to understand intuitively what good code is, providing you get lucky enough to find a good code base, or a code base with a significant amount of good code.
It's a long old journey, but once you have the skill, it never goes away. It's like learning a musical instrument or a foreign language. (By which I mean you can read as many books as you like about it, but without application, you haven't yet begun. Nevertheless, read the books.)
Warning: most developers never attain this skill, but almost all of them believe – truly believe – that they write good code; just as everyone thinks they are a good – nay excellent – driver.
Warning: no one writes good code. Good code becomes good through iteration, just as good writing becomes good by iteration/editing. The reason for this is obvious; but if you don't know why, then you haven't done enough yet.
Warning: everyone has biases. Learn to recognise yours and when you are applying them. Learn to ignore them and see things through a different lens. Explore with an open mind.
Iterating to good code is one of the most satisfying things you can do with software development.
Sqlite is supposedly high quality C code: https://github.com/smparkes/sqlite
For videos of someone (Casey Muratori) writing video game code and debugging it, Handmade Hero: https://handmadehero.org/
A blog post about how to write code by the same author: https://caseymuratori.com/blog_0015
For how to implement a fairly advanced type system, Typing Haskell in Haskell: https://gist.github.com/chrisdone/0075a16b32bfd4f62b7b
But, honestly, you're probably better off writing code yourself and learning by doing.
I've basically never been blown away by someone's function definitions or whatever, but I regularly run into well designed APIs.
Maybe it's because the API is the thing that has the ergonomics, but the code is just getting things to work. Not sure.
Gitlab, Bitbucket.
I don't really know what to expect. You want to see good production code, and the FOSS community is way way WAY better at this than some of the bubble gum you'd see in a professional setting.
Your question is too general, so I can't exactly give you a specific repository. I could direct you to BGFX[1] for a decent architecture of a cross platform renderer, but if you're not a graphics programmer, that may be a bad exercise, as you'd spend more time learning jargon than studying clean code. Or it uses patterns (or lack of, given graphics programming) that don't apply to your domain.
The problem with "good code" is that it tends to be code that -- more often than not -- is written by people I wouldn't trust to work on real-world large-scale software development.
On the other hand, any sufficiently large project that was successfully delivered necessarily contains almost exclusively "good code", because if it didn't, it would have collapsed under the overbearing weight of "bad code".
What makes the Android source code interesting reading:
- It's written by some of the best software engineers in the business, by any imaginable standard.
- It is an unimaginably successful project.
- It is code that deals with the gritty enduring reality of programming in the large that cannot reasonably be addressed by toy samples of "good code". That's where real "good code" lives.
- As a programmer, that's the kind of scale I want to work on. Preferably as one of the principal engineers working on Android 1.0. But Android 14 wouldn't be awful.
Overwhelmingly, it is exceptionally good code. Occasionally it is less than happy code. But the places where it is less than happy are almost more interesting than the places where it's good.
The code is 14 years old. It's been through 14 major releases (34 minor releases). It started on phones with 320x200 displays, with megabytes of memory, and processors that could barely run a toaster. And now it runs on phones with 4k displays with 8GB of memory, on processors that are about 2.4 kilo-Crays.
If you're a junior programmer, every single line is better than what you're capable of writing. If you're a senior or intermediate programmer, there is serious food for contemplation. The question that should be asked at every turn: if I was on the Android 1.0 development team, what could I have done to make this happier code?
And I'd really like to see what the "Less code is better code" guru (a recent HN posting) could do with androidx/fragment/app/Fragment.java and friends. A perfect example of a "Good Code Guru" that I would not trust to work on any of my projects.
---
There is no such thing as bad code; but some code is happier than other code.
-- Herbie Hancock.
Or what Herbie Hancock would have said if he were a programmer instead of a jazz musician.