https://sysprog21.github.io/lkmpg/
https://blog.sourcerer.io/writing-a-simple-linux-kernel-modu...
If you want to "go low" in the way hardware works, you could try and write an interrupt handler on an embedded device.
If you want to "go low" in how optimizations work in application development, you could try and implement microbenchmarks and look at flamegraphs.
I've recently enjoyed Game Engine Architecture, mostly because of its interesting mix of low-level techniques used to solve problems a normal application wouldn't be required to fix.
Game development in general is a case study in tuning your sense of when to use high-level programming techniques and when to drop into low-level optimization to solve local problems.
Some simple things you can do:
* Get yourself a suitable embedded development system - I would recommend anything ESP32-ish that suits your fancy, such as a LilyGO or Watchy ESP32-based watch, or a PineTime if that's more up your alley - and then write some little apps for it.
* Get to know Godbolt with a great deal of intimacy, just as a general approach to understanding what is going on.
* Invest a little workbench time in some of the various embedded frameworks out there - PlatformIO, FreeRTOS, etc. - and, very important: learn the tooling and methodology techniques that these frameworks manifest.
* Invest some workbench time in the RETRO Computing Scene. Seriously, you can still learn extremely valuable principles of tooling and methodology from an 8-bit retro system from the 80's. Get your favourite platform, get all its tools onboard, engage with its community - you will learn a lot of things that are still entirely relevant, in spite of the changes over the decades.
* Get into the F/OSS tooling/methodology flow - find software projects that are interesting to you, find their repositories, learn to clone and build and test locally, and so on. There are so many fantastic projects out there for which low-level skills can be developed and fostered. Get onboard with something that interests you.
Good luck!
The GPU race is getting really hot, and there is a lot of work being done to squeeze out every ounce of performance, especially for LLM training and inference.
One resource I would recommend is “Programming massively parallel processors” [1]
I am also learning it as a hobby project and uploading my notes here [2].
[1] https://shop.elsevier.com/books/programming-massively-parall...
[1] https://www.nand2tetris.org/
* Lower levels include transistor logic, analog electronics, electromagnetism, chemistry and the equilibrium equation (how transistors work), and quantum mechanics (how atoms and chemistry work).
1. The best source of low-level information on things like operating systems (writing your own) etc is https://wiki.osdev.org/Expanded_Main_Page
2. Compiler related low-level should include a read through Crafting Interpreters (https://craftinginterpreters.com/), even if all you're going to do is create compiled languages.
3. Hardware type low-level (where you build your own logic) is a long and ultimately non-rewarding path. I would suggest https://eater.net/8bit/
All those links are only starting points; use them to find a direction in which to head.
[EDIT: I also recommend writing an eBPF module for Linux - easier than writing a kernel module, with just as many low-level hooks as you might need].
Also, learn Rust.
This comes to mind: https://www.morling.dev/blog/one-billion-row-challenge/
Read how others have done it. Here's an example in Java that goes relatively low-level to squeeze out performance: https://questdb.io/blog/billion-row-challenge-step-by-step/
https://wiki.osdev.org/Main_Page
It will give you a much more holistic view of computers-as-hardware and their low-level intricacies, which are in my opinion more useful and more foundational than just being good at optimising a hot loop.
I did it in my teenage years, and it's my first true and only love. Now almost 20 years later I'm back at it, this time with all the accumulated experience in software engineering. There is nothing quite like it. Any basic, trite design (i.e. the usual POSIX clone) will teach you a great deal about the entire stack.
I learned assembly so I could disassemble and understand programs.
I learned C so I could use all the libraries and frameworks that people had made, and later C++, Objective-C, C#, Java, Python and other derivatives.
I wanted to manipulate images, speech and video, and using a high-level programming language was too inefficient, so I continued using C.
I learned FPGAs, again because I needed efficiency: the things I wanted to do, like controlling robots, did not work at all otherwise (they moved so sluggishly).
I love learning things, but that alone was never enough for me to learn something deeply; running into real problems was.
He discusses exactly what you're describing (L1/2/3 cache hit rates, their performance implications, how compiler optimizations can fool us into thinking we have a good hit rate, etc).
Also take a look into Intel VTune and Processor Tracing to understand how performance counters like Instructions per Cycle are calculated.
Here's a doc you can dig around with https://source.android.com/docs/core
You get a high level overview, then it explains how everything connects right down to the hardware. It's open source too, so you can go in there and poke around.
If you want something that can make money, I'd say look at camera and Bluetooth, because these are the things that need the most customization. Neural network API could see a lot of use in the future too.
But there's plenty of fascinating stuff, like how it renders fonts, how it handles hearing aids, and so on.
Edit: TIL Android has a category of 'rich haptics', where it gives tiny haptic feedback when you swipe your finger across a surface or to the beat of music. Very few app devs know this, so it's not well integrated into apps.
https://wiki.osdev.org/ is a good source for getting a foothold in OS development.
Not only will it show you how C/C++/Rust, etc... language statements map to CPU instructions, but it can also show you how CPUs execute those instructions! There are advanced views that show the various pipeline stages, execution ports, etc...
E.g.: https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename...
The right-most tab should show you the CPU execution pipeline.
And the hardest things are in the peripherals.
Any modern microcontroller will give you the opportunity to learn a lot of useful things about peripheral devices. You can start with any really good modern 8-bit micro, like Microchip's 2nd generation of ATtiny, so you'll have a lot of very powerful, interesting smart peripherals in your hands: a hardware event system, small programmable logic, different timers, a good ADC, etc.
The only rational additional consideration here is that your target platform should be popular, well documented and supported by the manufacturer.
Then it will be time for some Cortex-M0 device with DMA.
Then you'll decide where to go further :)
Learn how it works, try adding a new instruction or implementing an extension.
Then write your own OS.
https://bootstrappable.org/ https://github.com/fosslinux/live-bootstrap/ https://bootstrapping.miraheze.org/wiki/Stage0
Start with this - https://bottomupcs.com/
Then do this - https://www.youtube.com/playlist?list=PLhy9gU5W1fvUND_5mdpbN...
and finally this https://diveintosystems.org/book/introduction.html
Learn everything there is to learn about the Tillitis TKey. It's the most open (both software and hardware) USB security token there is. It is FPGA-based and contains a tiny RISC-V core.
Full disclosure: I'm involved in the project.
Learn how to use a profiler like Linux's perf, VTune, or Apple's Instruments - which means learning to interpret their results to optimise your code.