- "The Linux Programming Interface"
- "Systems Programming with Linux"
- "Advanced UNIX Programming"
What I struggle with: How do I get exposure to projects so I can learn for a future job? I had a Rust job for around half a year, where people built web servers and came from a C and C++ background. Half of the stuff they wrote I didn't understand (flushing, opening another channel just for logs so we don't fill up the other ones, etc.).
Now I wonder how I can get access to this type of information, how to properly learn it?
It's something I was terrified of doing.
But once I got in there and started poking around, I realized it was just ordinary plain-vanilla C code. Not C++. Just C code.
With my local copy, I started to hack pg_dump to do something special that we wanted at the time. Even after 30 years of coding, I'm not that especially good of a programmer. But I ended up getting our own special version of pg_dump that did what we wanted at the time and it went into production dumping hundreds of gigs of data every day!
But what I'm not, is afraid. I'm not afraid to try anything.
And that's what it takes to do deep, systems level programming.
Don't be afraid.
Those bits are just bits. And it's just code... and most of it was not written by wizards. Just ordinary people like you and me. Don't be afraid man.
Clone the repo, set up a workable build environment, and start tinkering, compiling and running to see what happens.
You would be totally shocked to find out what you can actually achieve.
Those books will only go so far.
- One system in isolation - Operating Systems: Three Easy Pieces. Covers persistence, virtualisation and concurrency. This book is available for free at https://pages.cs.wisc.edu/~remzi/OSTEP/
- Multiple systems, and how data flows through them - Designing Data Intensive Applications. Covers the low level details of how databases persist data to disk and how multiple nodes coordinate with each other. If you’ve heard of the “CAP theorem”, this is the source to learn it from. Worth every penny.
More on why these two books are worth reading at https://teachyourselfcs.com
It is easy to get started. Here's a system call function for x86_64 Linux:
https://github.com/matheusmoreira/liblinux/blob/master/sourc...
With this single function, it is possible to do anything. You can rewrite the entire Linux user space with this.
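To give a feel for what such a function looks like, here is a minimal sketch in C. It is not the code from the linked repository: the names `raw_syscall`, `raw_write` and `raw_getpid` are made up for illustration, and it assumes GCC or Clang on x86_64 Linux, where the syscall number goes in `rax`, the arguments in `rdi`, `rsi`, `rdx`, `r10`, `r8`, `r9`, and the kernel returns a negative errno value on failure.

```c
#include <assert.h>

/* Hypothetical helper, x86_64 Linux only: issue one system call.
 * Returns the kernel's result, or a negative errno value on failure. */
static long raw_syscall(long number, long a1, long a2, long a3,
                        long a4, long a5, long a6)
{
    long ret;
    /* Arguments 4-6 have no dedicated constraint letters, so pin them. */
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    __asm__ volatile ("syscall"
                      : "=a"(ret)                 /* result comes back in rax */
                      : "0"(number),              /* syscall number in rax    */
                        "D"(a1), "S"(a2), "d"(a3),
                        "r"(r10), "r"(r8), "r"(r9)
                      : "rcx", "r11", "memory");  /* clobbered by `syscall`   */
    return ret;
}

/* Syscall numbers from the x86_64 table. */
enum { NR_WRITE = 1, NR_GETPID = 39 };

/* write(2) without libc: returns bytes written or a negative errno. */
static long raw_write(int fd, const void *buf, unsigned long len)
{
    assert(buf != 0);
    return raw_syscall(NR_WRITE, fd, (long)buf, (long)len, 0, 0, 0);
}

/* getpid(2) without libc. */
static long raw_getpid(void)
{
    return raw_syscall(NR_GETPID, 0, 0, 0, 0, 0, 0);
}
```

Calling `raw_write(1, "hi\n", 3)` from any program writes to standard output without going through libc's `write` wrapper. Every other system call is the same pattern with a different number and arguments.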
The Linux kernel itself has a nolibc.h file that they use for their own freestanding tools:
https://github.com/torvalds/linux/blob/master/tools/include/...
It's even better than what I came up with. Permissively licensed. Supports lots of architectures. There's process startup code so you don't even need GCC's startfiles in order to have a normal main function. The only missing feature is the auxiliary vector. You can just include it and start using Linux immediately.
You can absolutely do this with Rust as well. Rust supports programming with no standard library. The inline assembly macros were an unstable feature for a long time, but `asm!` has been stable since Rust 1.59.
From there, you'll know where to go next based on what you've learned so far.
My recommended reading list is:
[1] Operating Systems: Three Easy Pieces https://pages.cs.wisc.edu/~remzi/OSTEP
[2] Intel Software Developer Manuals (especially volume 1 and 3A) https://software.intel.com/content/www/us/en/develop/article...
[3] OSDev wiki https://wiki.osdev.org
Most existing systems programming is in C/C++. Rust is new and there isn't much battle hardened code out there.
Since you're looking at Rust, https://www.youtube.com/user/gamozolabs/videos could be a good fit.
1. https://pdos.csail.mit.edu/6.S081/2020/schedule.html (check video links and do all the lab assignments)
2. https://www.youtube.com/watch?v=dEWplhfg4Dg&list=PLf3ZkSCyj1... (based on old MIT 6.828)
3. Networks Review (self study)
Somebody mentioned the Intel manuals in the comments (those are really helpful).
These courses helped a lot. I would suggest taking a firmware and device driver development course as well. My conclusion is that these are tough skills, and the learning process can be accelerated if you can find an entry-level position in a small company which does this kind of work.
You've found some pretty good resources already by the looks of it though.
Although the language you use to learn system & network programming doesn't matter much, it is better if you use C or C++ to practise and learn. This is because the kernel itself is written in C and exposes system calls that can be used directly from a C/C++ program. That said, "The Linux Programming Interface" (I am personally reading it) is a really good book. It talks a lot about how one should go about using system calls to get things done by the kernel. Make sure to read a little every day and try out the examples by writing C/C++ programs.
I recently realized that TLPI doesn't talk much about why things are the way they are (a very good example would be virtual memory and related stuff). You should refer to a theoretical book for this. I suggest you go with "Operating Systems" by Deitel & Choffnes.
Read man pages and practise using the libc/kernel APIs. For example, if you want to know about flushing, read 'man 3 fflush'. Flushing is needed when you want output that the C library has buffered to actually be written out, for example before you wait for fresh input from stdin. If prompts are buffered, you definitely don't want to "scanf" before you have flushed the buffers. If you want to learn network programming, read the chapters related to sockets and refer to 'man 2 socket'.
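As a small illustration of the prompt case, here is a sketch in C. The helper name `prompt_int` is made up for this example. It takes the streams as parameters so it can be tried with files as well as with `stdout` and `stdin`. Some C libraries flush `stdout` for you when you read from a terminal, but the C standard does not promise that, so the explicit `fflush` is the safe habit.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print a prompt, make sure it is really written out,
 * then read one integer. Returns 0 on success, -1 on any failure. */
static int prompt_int(FILE *out, FILE *in, const char *prompt, int *value)
{
    assert(out && in && prompt && value);

    if (fputs(prompt, out) == EOF)
        return -1;

    /* The prompt has no trailing newline, so it may still be sitting in
     * the stdio buffer. Push it out before blocking on input. */
    if (fflush(out) == EOF)
        return -1;

    return fscanf(in, "%d", value) == 1 ? 0 : -1;
}
```

Typical use would be `prompt_int(stdout, stdin, "Enter a number: ", &n)`. Without the `fflush`, a user whose output is redirected through a pipe could be left staring at a blank screen while the program waits for input.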
You will eventually get to a point where you will be able to connect all the dots (APIs) and figure out exactly what you need to get a problem solved.
Finally, don't learn for a future job. Learn for yourself. This will help you in the long run.
At that time, we had the option to work with MINIX. Here are the MINIX Role-based Access Control and Firewall Labs:
https://web.ecs.syr.edu/~wedu/seed/Labs_12.04/System/RBAC_Ca... https://web.ecs.syr.edu/~wedu/seed/Labs_12.04/Networking/Fir...
Professor Du's materials are also packaged for self-learners and other teachers to use, as the open source SEED project. A few of the current SEED projects are implementation-exercises similar to the above two labs.
I highly recommend the above resources.
https://archive.org/details/pcinternsystempr0000tisc
Granted they are probably too old, but the concepts of what systems programming actually is are there. You can then get hold of an Arduino or Raspberry Pi-like device and do your own little OS or bare metal game:
https://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/os/o...
http://www.science.smith.edu/dftwiki/index.php/Tutorial:_Ass...
Or maybe trying your hand at compilers with https://www.nand2tetris.org or given the similarities of Rust with ML, maybe dive into Tiger Book (https://www.cs.princeton.edu/~appel/modern/ml).
Or maybe
https://www.manning.com/books/rust-in-action
https://www.apress.com/gp/book/9781484258590 (Rust for IoT)
Later this year we're having our third conference [1], so it could prove useful* to meet up with systems programmers there.
* The usual self-plug warning (I organize these things.)
* Low-level languages, compiler runtimes, toolchains and libraries.
* OS system call apis.
* Concurrency.
* Networking.
You need books/papers which will teach and walk you through sample idioms and applications in the above domains. A prerequisite is fluency in the C language. With that in mind the following are recommended (some are old books which you can buy used and cheap wherever possible):
* Computer Systems: A Programmer's Perspective by Bryant and O'Hallaron.
* The C Companion by Allen Holub. A gem of an oldie.
* ELF: From the Programmer's Perspective; a paper by Hongjiu Lu.
* UNIX Network Programming by Richard Stevens. Initially, get the old 1st edition since it contains TCP/IP, IPC etc. all in one volume.
* Advanced UNIX Programming by Marc Rochkind.
* Advanced Programming in the UNIX Environment by Richard Stevens.
* UNIX Systems Programming by Robbins and Robbins.
* Programming for the Real World, POSIX.4 by Bill Gallmeister.
I've tried to learn C probably 4 times now and I just don't like it. But then I came across LuaJIT ffi, which very easily allows you to use whatever shared library and call whatever syscalls directly, and that was a game changer!
After that I decided to test ziglang, where a big part of its design is interoperability with C, and I'm in love with it! It really feels like anything I would need C for I can do in zig.
If Rust is your jam, find a way to call the Linux syscalls directly from Rust, not just using a cargo library, but actually importing the appropriate headers and successfully doing an epoll or something.
It will feel like suddenly the man (3) pages all make sense and are extremely useful!
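For reference, this is roughly the sequence of calls an "epoll or something" exercise boils down to, sketched in C because that is the interface the man pages describe. A Rust version would go through FFI declarations for the same three functions. The helper name `wait_readable` is made up for this example, and it assumes Linux, since epoll is Linux-specific.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);

    /* 1. Create an epoll instance (itself a file descriptor). */
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    /* 2. Register interest in "fd has data to read". */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }

    /* 3. Block until an event arrives or the timeout expires. */
    struct epoll_event ready;
    int n = epoll_wait(epfd, &ready, 1, timeout_ms);
    close(epfd);

    if (n < 0)
        return -1;
    return (n == 1 && (ready.events & EPOLLIN)) ? 1 : 0;
}
```

A quick way to try it is with `pipe(2)`: the read end reports a timeout until something is written to the write end. Regular files cannot be registered with epoll, so test with pipes or sockets.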
Good luck!
I think it's worth mentioning that the majority of topics involved in this type of webserver work wouldn't necessarily be covered by a systems programming book. There's obviously overlap between all of systems programming, networking, and concurrent/distributed systems, but if you plan to focus on web servers, I'd pick up texts on the other topics as well.
Modern Operating Systems by Tanenbaum is a good theory book - this will probably answer your questions about flushing etc
for down and dirty:
Advanced Programming in the UNIX Environment
TCP/IP Illustrated, Volume 1 (2 and 3)
Three things helped me a lot to learn more about systems programming:
(1) the reading of existing systems code, especially (i) from a book called Dr Dobb's C-Tools, which includes a C compiler, assembler and linker as well as many command line tools and (ii) the Minix source code. It was the code in this book rather than K&R or Stevens that let me "get" systems programming because I needed to see the bigger picture, and many books only show small code snippets.
(2) the study of other people's code; if you have access to a C guru, it's really helpful to just peek over their shoulders for a couple of hours as they implement a new module and then debug it (thanks, Gero and Rolf!) - thankfully, there is a new trend of people recording coding sessions and putting them on YouTube, so more people out there can benefit from experienced hackers e.g. https://www.youtube.com/watch?v=1-7VQwWo2Tg . And, of course,
(3) implementing a non-trivial low-level component. For me, this was having to implement the buffer management of a relational database management system in C from scratch as an exercise in my undergraduate degree (we were given 6 weeks, but not full-time, as lectures were going on at the same time). This course, Systems Programming II, was as beneficial as it was gruesome, but I'm grateful to excellent line-by-line pencilled feedback of one tutor that read the complete code and commented every missing return value check etc.
- Release It! (https://pragprog.com/titles/mnee2/release-it-second-edition/)
- Designing Data-Intensive Applications (https://dataintensive.net/)
I would suggest finding an open source project of interest and taking a deep dive into its code and documentation to understand how it works and why it was built that way.
Which reminds me, this should help with that: The Architecture of Open Source Applications (http://www.aosabook.org/en/index.html)
Advantages:
1. Problem will be well defined for you.
2. Better to implement eventual consistency between 3 nodes, a distributed file system or a single-user database than to try to figure out a bug in a large open source codebase.
You may find the following links helpful for finding such courses:
The "Operating Systems - Three Easy Pieces" is one great book that has already been mentioned. I would also suggest "Computer Systems - A Programmer's Perspective" along the same lines (https://csapp.cs.cmu.edu/).
Computer Networking is another field you're likely to run into. "Computer Networks: A Systems Approach" is a good book (https://book.systemsapproach.org/)
Start with a book or a few texts/tutorials on the subject and begin building. Along the way you'll find the questions and choices involved. Find the answers from the internet or books. This is when I'd recommend some open source projects (not earlier) to see how they solved the specific problems. If you just go into an open-source project you won't have an understanding of the problem, just the answers, so it won't help you nearly as much.
(And by the way, thanks OP for asking this question! I’ve also been wanting to learn systems programming, but I haven’t gotten around to asking yet. And all the suggested resources look fascinating… there goes my university vacation!)
You've already got half the solution... "Advanced UNIX Programming" has been the go-to since it was written, and it has a tonne of industry knowledge, e.g. fork twice and close handles etc.
"The Linux Programming Interface" is kind of that book for the 21st century, and covers pretty much any topic you'll ever encounter on the systems end.
> Half of the stuff they wrote I didn't understand
Sounds like you bought the books but never read them... because it's all in there.
If anyone here wants a quality PDF of either, I can make that happen, I digitize textbooks as a hobby. I bought both books new from Amazon and they were well worth the price.
This will teach you what Rust is saving you from and help you understand how it does it.
I would suggest some kind of performance oriented project, where syscalls costs and concurrency issues matter? This will give you a 'feel' for what's going on, and why what's available is available.
Finally, I suggest reading LWN, which will give you a very good idea of what's happening in the kernel:
Also, make the case that many things that you learn while doing frontend programming like asynchronous programming are skills that will port over just fine.
[1] Operating Systems Foundations with Linux on the Raspberry Pi: Textbook
[2] Systems Performance (2nd Edition)
Make sure your desk is clean. Get yourself a nice big pad of dotted paper.
If not, what is a channel?
* C/C++
Learn C and its standard libs (stdio etc.), if you haven't already. Choose a good book for this because many tutorials you find online are pretty incomplete. Then read one of your books about UNIX APIs.
Also it's worth learning how to use a C debugger (gdb or visual studio one).
* OS
You can look at the Linux source code but it would be daunting. I'd suggest starting with a teaching OS and its accompanying book, e.g. xv6 or MINIX.
Try making modifications to it. Toy OSes generally come with such exercises.
* Assembly language
Learning C doesn't give you a complete idea of how machines work. Learn x86_64 assembly programming. You can inspect the assembly output of your C programs using godbolt's Compiler Explorer website. Assembly is a little boring, so don't try to memorize instructions. The mental model is the important thing.
* Basics of algorithms and data structures
Maybe you already know these, because you worked on back ends. But if you're not familiar with a few data structures like hash tables, B-trees etc., it might be worth familiarising yourself.
Some miscellaneous topics you might get interested in: linkers / executable formats, OS level stuff related to computer networks, multi threading, file systems, SIMD / vector instructions in assembly.
As others said, write some code when learning. You don't need to do entire projects yourself, you can also play around with established projects.
I don't think this is systems programming so much as network programming.
You should try embedded work or try to write drivers.
Better yet, write a mini OS of your own that interfaces with hardware.
Edit: Wow, I'm getting down voted. I guess political correctness