HACKER Q&A
📣 throwaway63467

Which books/resources to understand modern Assembler?


I’d like to learn more about Assembler in order to be able to work with LLVM and JIT as well as to write high performance low-level code. I’m familiar with the basics of x86 but I haven’t touched Assembler in a while, so I’m wondering which resources and in particular books you’d recommend?


  👤 jstrieb Accepted Answer ✓
Not specific to LLVM or JIT, but if you want a visceral intuition for the basics of ARM assembly, I made a free, online game at work (for mobile and desktop) that may help you:

https://ofrak.com/tetris/

I didn't do much ARM before working on the game, but since playing a lot, I'm very quick at reading disassembly, even for instructions not present in the game. It might help you to do the same – the timed game aspect forces you to learn to read the instructions quickly.

The game is like Tetris, but the blocks are ARM assembly instructions. As instructions fall, you can change the operand registers. Locking instructions into the .text section executes them in a CPU emulator running client-side in the browser, so you can immediately see the effects of every action. Your score is stored in memory at the address pointed to by one of the registers, so even though you earn points for each instruction executed without segfaulting, the true goal is to execute instructions that directly change the memory containing the score value.

When I released it a bit less than a year ago, I posted it to Hacker News as a Show HN:

https://news.ycombinator.com/item?id=37083309


👤 WalterBright
All you really need is an instruction set reference, such as https://www.felixcloutier.com/x86/index.html and a compiler that supports an inline assembler with Intel assembler syntax, like the D compiler. Use it like:

    private uint asmBitswap32(uint x) @trusted pure
    {
        asm pure nothrow @nogc { naked; }

        version (D_InlineAsm_X86_64)
        {
            version (Win64)
                asm pure nothrow @nogc { mov EAX, ECX; }
            else
                asm pure nothrow @nogc { mov EAX, EDI; }
        }

        asm pure nothrow @nogc
        {
            // Author: Tiago Gasiba.
            mov EDX, EAX;
            shr EAX, 1;
            and EDX, 0x5555_5555;
            and EAX, 0x5555_5555;
            shl EDX, 1;
            or  EAX, EDX;
            mov EDX, EAX;
            shr EAX, 2;
            and EDX, 0x3333_3333;
            and EAX, 0x3333_3333;
            shl EDX, 2;
            or  EAX, EDX;
            mov EDX, EAX;
            shr EAX, 4;
            and EDX, 0x0f0f_0f0f;
            and EAX, 0x0f0f_0f0f;
            shl EDX, 4;
            or  EAX, EDX;
            bswap EAX;
            ret;
        }
    }
The compiler will handle all the program setup and teardown, and you can just concentrate on the assembler part. You can also compile programs with the -vasm switch and the compiler will emit the asm corresponding to the code:

    int square(int x) { return x * x; }
compiling:

    dmd -c test.d -vasm
prints:

    _D4test6squareFiZi:
    0000:   0F AF C0                 imul      EAX,EAX
    0003:   C3                       ret
By trying simple expressions like `x * x` and looking at what the compiler generates, and looking at the instructions in the referenced link, you'll get the hang of it pretty quick.
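
For readers who find the asm above easier to follow next to a high-level version, here is a rough C sketch of the same bit-reversal steps. The name `bitswap32` is made up for illustration; it is not the D runtime's code, just the same masks and shifts written out.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit word, mirroring the asm step by step. */
uint32_t bitswap32(uint32_t x)
{
    /* Swap adjacent bits (the 0x5555_5555 stage). */
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    /* Swap adjacent 2-bit pairs (the 0x3333_3333 stage). */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    /* Swap adjacent nibbles (the 0x0f0f_0f0f stage). */
    x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
    /* Reverse the four bytes, which is what the final bswap does. */
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}
```

Each C statement corresponds to one mov/shr/and/and/shl/or group in the asm, so compiling this and comparing the output against the hand-written version is a reasonable first exercise.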

👤 sargstuff
"Computer Architecture: A Quantitative Approach" and/or books on more specific designs (MIPS, ARM, etc.) can be found in the Morgan Kaufmann Series in Computer Architecture and Design.

"Getting Started with LLVM Core Libraries: Get to Grips with LLVM Essentials and Use the Core Libraries to Build Advanced Tools"

"The Architecture of Open Source Applications (Volume 1) : LLVM" https://aosabook.org/en/v1/llvm.html

"Tourist Guide to LLVM source code" : https://blog.regehr.org/archives/1453

llvm home page : https://llvm.org/

llvm tutorial : https://llvm.org/docs/tutorial/

llvm reference : https://llvm.org/docs/LangRef.html

learn by examples : C source code to 'llvm' bitcode : https://stackoverflow.com/questions/9148890/how-to-make-clan...


👤 t-3
I like this book, it's just as good as The Art of Assembly Language, but much cheaper: https://rayseyfarth.com/asm/index.html

If you are interested in ARM or RISC-V assembly, the concepts are pretty similar but the instructions are different. For any architecture, you're going to have to read the architecture manuals to get a good working knowledge of the instructions and how to use them. An easy way to get started is to write a program in C, then replace the functions with assembly code one by one until your C code is just main() and a header.

ARMv7: https://developer.arm.com/documentation/100076/0200/a32-t32-...

ARMv8: https://developer.arm.com/documentation/ddi0602/2024-03/Base...

alternative: https://www.scs.stanford.edu/~zyedidia/arm64/

RISC-V: https://riscv.org/technical/specifications/

x86: https://www.intel.com/content/www/us/en/developer/articles/t...

web format: http://x86.dapsen.com/

If you like to learn by example (most of these are not great, but good enough to get started):

https://rosettacode.org/wiki/Assembly

https://github.com/TheAlgorithms/AArch64_Assembly
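
The "replace C functions with assembly one by one" idea above can be sketched like this. It is only an illustration: `add_c` and `add_asm` are hypothetical names, the asm uses GCC/Clang extended inline assembly rather than a separate .s file, and it falls back to plain C on targets it doesn't recognize.

```c
#include <stdint.h>

/* Reference version: stays in C so the two can be compared. */
uint32_t add_c(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Same contract, with the body swapped for assembly on known targets. */
uint32_t add_asm(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* AT&T syntax: add b into a. "+r" marks a as both read and written. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#elif defined(__GNUC__) && defined(__aarch64__)
    /* %w selects the 32-bit W view of the register on AArch64. */
    __asm__("add %w0, %w0, %w1" : "+r"(a) : "r"(b));
    return a;
#else
    /* Unknown compiler or architecture: keep the C behaviour. */
    return add_c(a, b);
#endif
}
```

Keeping the C version around gives you something to test the assembly against as you convert more functions.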


👤 pizlonator
How I learned:

Step #1: read the arch manual for some CPU. Read most if not all of it. It’s a lot of reading but it’s worth it. My first was PowerPC and my second was x86. By the time I got to arm, I only needed to use the manual as a reference. These days I would start with x86 because the manuals are well written and easily available. And the HW is easily available.

Step #2: compile small programs for that arch using GCC, clang, whatever and then dump disassembly and try to understand the correspondence between your code and the instructions.
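
As a sketch of Step #2, something like the file below is small enough to match source lines to instructions. The file name `sum.c` and the function are made up for illustration; the commands in the comment are the usual GCC/Clang and binutils ones.

```c
/* sum.c (hypothetical file name)
 *
 * Typical commands:
 *   cc -O2 -S sum.c      writes the compiler's assembly to sum.s
 *   cc -O2 -c sum.c      writes the object file sum.o
 *   objdump -d sum.o     disassembles the object file
 */
#include <stddef.h>

/* Add up `count` ints; look for the load and the add inside the loop. */
long sum_array(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

At higher optimization levels the loop may get unrolled or vectorized, so it can help to compare against -O0 or -O1 output first.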


👤 chc4
IMO you should just stick some programs in Ghidra/Godbolt and see what they emit, especially for small individual snippets whenever you think "I want to do X, what's the best way of doing it". There really isn't much difference between a "baby's first assembly" program, where you just have movs and like five other common instructions, and the kind of assembly an optimizing compiler emits: it's a matter of recognizing that some operations can be merged into a more specialized one or into the addressing mode of another, or that you can use a setcc with a result flag from something you already computed, or what have you. The good code that LLVM and JITs emit is for the most part not due to much better instruction selection but to much better optimization passes, which learning more about assembly doesn't help with: that's about transforming code in general at a high level, which you would do at the compiler IR step before touching assembly at all.
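
A few snippets of the kind you might paste into Godbolt to see the merging described above. The function names are made up, and the instruction choices in the comments are what x86-64 compilers commonly emit with optimization on, not a guarantee.

```c
/* A comparison used as a value: commonly cmp followed by setl,
 * with no branch at all. */
int is_less(int a, int b)
{
    return a < b;
}

/* base + index*4 + constant offset: often folds into one addressing
 * mode, something like [rdi + rsi*4 + 8]. */
int load_two_ahead(const int *base, long index)
{
    return base[index + 2];
}

/* x * 5 is frequently lowered to lea with a scaled index (x + x*4)
 * instead of a multiply. */
int times_five(int x)
{
    return x * 5;
}
```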

👤 woadwarrior01
I'm currently reading Apple's "Apple Silicon CPU Optimization Guide"[1] and it's excellent! Very reminiscent of Intel's Software Developer Manuals[2], which I read a long time ago.

[1]: https://developer.apple.com/documentation/apple-silicon/cpu-...

[2]: https://www.intel.com/content/www/us/en/developer/articles/t...


👤 mtreis86
Play through the game Turing Complete; by the end you'll have built your own ISA and solved some puzzles with it. Keep playing to get on the high scores list and you'll turn those assembly routines into ASICs.

👤 zoenolan
https://www.nand2tetris.org/

As a good refresher on assembly and compilers


👤 ksherlock
Someday -- not today, not tomorrow, but someday -- you'll probably want to read Agner Fog's optimization manuals.

https://www.agner.org/optimize/#manuals


👤 asalahli
I started with this excellent NASM tutorial[0] then went straight to Intel manuals.[1]

0. https://cs.lmu.edu/~ray/notes/nasmtutorial/

1. https://www.intel.com/content/www/us/en/developer/articles/t...


👤 joncmu
If you want to learn one of the oldest assembly languages that you can still find a modern computer to run, check out IBM Z assembly. There is a great list of resources here: https://idcp.marist.edu/assembler-resources

The one resource they don't list is the ISA manual, which is called the Principles of Operation. The latest version can be found here: https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf

It is actually pretty amazing how easy it is to learn other architectures once you understand how one or two work.


👤 CoastalCoder
I don't have a good book to suggest, but one tip you may find helpful:

A typical function has two kinds of assembly code:

(1) The ABI-required logic for functions and function calls, and

(2) Everything else, which can be more or less whatever you want, as long as you don't stomp on the details required by the ABI.
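
To make the split a bit more concrete, here is a sketch using GCC/Clang extended inline asm on x86-64 (the function name `mul_add` is made up, and other targets fall back to plain C). The compiler generates part (1), the call and return logic; the asm body is part (2). RBX is a callee-saved register in the common x86-64 ABIs, so listing it as clobbered is how you tell the compiler you stomped on it and it needs saving and restoring.

```c
#include <stdint.h>

/* Compute a * b + c, deliberately using RBX as scratch space. */
uint64_t mul_add(uint64_t a, uint64_t b, uint64_t c)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint64_t result;
    __asm__(
        "movq  %1, %%rbx\n\t"   /* scratch = a          */
        "imulq %2, %%rbx\n\t"   /* scratch = a * b      */
        "addq  %3, %%rbx\n\t"   /* scratch = a * b + c  */
        "movq  %%rbx, %0"       /* hand the value back  */
        : "=r"(result)
        : "r"(a), "r"(b), "r"(c)
        : "rbx", "cc");         /* declare what we overwrote */
    return result;
#else
    /* Not x86-64 or not GCC/Clang: plain C with the same result. */
    return a * b + c;
#endif
}
```

If you drop "rbx" from the clobber list, the code may still appear to work, which is exactly the kind of ABI bug that shows up much later in an unrelated caller.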


👤 anonymoushn
The highload.fun wiki[0] links some resources. The Intel optimization manual[1] is also useful.

These resources are mostly aimed at solving problems for which compilers are not very useful, so there are probably other resources that are a better fit.

[0]: https://github.com/Highload-fun/platform/wiki

[1]: https://www.intel.com/content/www/us/en/content-details/6714...


👤 sim7c00
Low-Level Programming by Igor Zhirkov, even though it's not really about assembler specifically. It has a good chapter on it and teaches you how to apply knowledge of machine code/assembly to an architecture/system (amd64 in this case), and then spends a lot of time teaching how to translate that upward to higher-level languages rather than downward. It teaches you to find out and research stuff yourself too. He's a good teacher.

I know it's not about LLVM and JIT etc., but IMHO the basics come first, and then you move up. Otherwise it's confusing.


👤 oldmanludd
OpenSecurityTraining2 has some Assembly courses

https://p.ost2.fyi/courses


👤 mtklein
I'd suggest working incrementally from areas of your existing strength. Tweak whatever code base you are most familiar with, starting with a tiny change, and see how the assembly changes. I use objdump -d and git diff --no-index for this all the time.

👤 billsix
I've liked Jonathan Bartlett's books, his newest is "Learn to Program with Assembly"

👤 volkadav
If you're looking for introductory material, I'd highly recommend Computer Systems, A Programmer's Perspective by Bryant and O'Hallaron: https://csapp.cs.cmu.edu/ It sounds like the material you're after would mostly be in chapters two or three through five depending on where you'd want to start. The second edition is much cheaper used and follows broadly the same path, though it does have x86-32 in the main text with -64 as an appendix ("web aside"); third swaps that.

👤 maldev
I would highly recommend AMD's developer manual. It's written much more for actual reading, rather than being a pure tech manual with super thick language like Intel's.

I would also recommend NASM's guide for syntax and such. https://www.nasm.us/xdoc/2.13.03rc1/html/nasmdoc0.html


👤 bombcar
The first thing you’ll learn is that a macro assembler is surprisingly high level; much of what you think of as C-style high level can be done by macros.

👤 jim_lawless
"x64 Assembly Language Step-by-Step: Programming with Linux" (4th edition) by Jeff Duntemann is a pretty good book.

👤 andrewstuart
> modern assembler

For long out of date assembler this YouTube channel: https://www.youtube.com/@ChibiAkumas

For modern assembler this YouTube channel: https://www.youtube.com/@WhatsACreel


👤 brianrhall
Assembly Programming and Computer Architecture for Software Engineers (https://github.com/brianrhall/Assembly). Also helpful is Compiler Explorer (https://godbolt.org/), although a more modern way to do cross-platform low-level tasks is with compiler intrinsics. The book above introduces intrinsics, but Intel has a great intrinsics guide: https://www.intel.com/content/www/us/en/docs/intrinsics-guid...
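
For anyone who hasn't seen intrinsics, here is a minimal sketch of what they look like in C. It uses three standard SSE2 intrinsics from `<emmintrin.h>`; the wrapper name `add4_i32` is made up, and the scalar loop is only there so the file still builds on non-x86 targets.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Add four 32-bit integers element-wise: out[i] = a[i] + b[i]. */
void add4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
#if defined(__SSE2__)
    /* Unaligned 128-bit loads, one packed add (paddd), one store. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Portable fallback with the same result. */
    for (int i = 0; i < 4; i++) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

The compiler still does register allocation and scheduling for you, which is most of the appeal compared to hand-written assembly.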

👤 alexdowad
Aside from what has already been suggested, you could consider reading selected chapters of Intel's programmer manual. I personally read through the whole thing once (well, skimmed some parts).

From my experience, Intel's x86 manual is better and easier to read than AMD's. It's a free download.


👤 Koshkin
A great way to learn assembler is to closely examine code generated by a compiler, e.g. on godbolt.org.

👤 JonChesterfield
There are two aspects to assembler. One is the target machine - learning what instructions, memory, performance characteristics you're dealing with.

The other is the assembler - what syntax it gives you, how it handles macros, whether it optimises, whether it does any semantic analysis. GNU AS is different to NASM is different to flat assembler.

I didn't get much out of reading compiler disassembly relative to handwritten assembly. I'd recommend trying to find some of the latter, might need to be maths libs or video codecs or similar. I'd be interested in recommendations here, the asm I learned from was proprietary.


👤 vmchale
I wrote a blog post on writing a JIT that can handle FFI calls: http://blog.vmchale.com/article/jit

If you want the full monty, I think you'll have to read the LLVM documentation on JIT linking: https://llvm.org/docs/JITLink.html

I haven't found any academic papers or tutorials on JIT linking, unfortunately.


👤 rerdavies
Go to the source: the Intel Software Developer Manuals.

https://www.intel.com/content/www/us/en/developer/articles/t...

You will want the first two volumes. For LLVM and JIT work you don't need the last two volumes.

Not kind, or gentle, but certainly definitive and authoritative.


👤 criddell
Lots of people are recommending x86 and I wonder if they are talking mostly about the x86 specifically or would that include x86-64? I’d really like to get better at working with crash dumps and since everything in my world is 64-bit, that’s what I’m seeing.

BTW, if anybody has recommendations for assembly in the context of crash dumps, I’d be very appreciative.


👤 kylecazar
Fond memories of ordering the then-free print volumes of the IA-32 reference manuals from Intel... and actually receiving them.

👤 vbezhenar
If you're interested in 32-bit ARM, I can recommend the book "Raspberry Pi Assembly Language Programming: ARM Processor Coding"

It very thoroughly describes Cortex M0 assembly language and also touches on the concept of multiprocessor programming. You just need two Raspi Picos (one to serve as the programmer), which are readily available.


👤 dragontamer
> as well as to write high performance low-level code

This is different. I would suggest "Intel® 64 and IA-32 Architectures Optimization Reference Manual", as well as https://www.agner.org/optimize/ .


👤 badrabbit
Azeria labs has a nice arm assembly tutorial, the lady behind it also has nice books on it that I highly recommend.

👤 Vosporos
The Mario Kart Wii retro-players have you covered with ARMv8: https://mariokartwii.com/armv8/

👤 pyinstallwoes
Build a Forth

👤 KingOfCoders
OT: Not a modern one, but "Z80 Assembly Language Subroutines" has been my favorite computer book for 40+ years.

👤 HarHarVeryFunny
As you probably appreciate, there's a big difference between just being able to write assembler, and being able to write optimized assembler that will beat a modern optimizing compiler (else what's the point, other than fun, unless you are the one writing the compiler, which seems to be your interest).

One of the issues with modern processors (wasn't true back in the day with the old 8-bitters) is that the processor is so much faster than memory access that this needs to be taken into consideration when writing optimized code. Instruction timings (number of clock cycles) for memory access are going to vary a lot depending on where the data is being held - in cache or in main memory. Writing optimized code (high level as well as assembler) therefore becomes not just a matter of making the code itself as minimal and fast as possible, but also organizing the program's data access to operate out of cache as much as possible and minimize main memory access. The key is to be sensitive to the layout of your data in memory, and try to have your inner loops/code access nearby (same cache line) data rather than hopping about all over the place. e.g. If you have a 2-D array that's laid out in memory row by row (vs col by col), then you want to access it that way too (work on rows) to take advantage of cache.
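
A small C sketch of the 2-D array point (function names are made up; the array is stored row by row in one flat buffer). Both functions compute the same sum, but the first touches consecutive addresses in its inner loop while the second jumps `cols * sizeof(int)` bytes on every access.

```c
#include <stddef.h>

/* Inner loop walks along a row: consecutive addresses, cache-friendly. */
long sum_by_rows(const int *grid, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            total += grid[r * cols + c];
        }
    }
    return total;
}

/* Inner loop walks down a column: each access lands cols ints away. */
long sum_by_cols(const int *grid, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t c = 0; c < cols; c++) {
        for (size_t r = 0; r < rows; r++) {
            total += grid[r * cols + c];
        }
    }
    return total;
}
```

On small arrays that fit in cache you likely won't see a difference; the gap tends to show up once the array is much larger than the cache, so time it with sizes that match your real data.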

I used to write a lot of 8-bit assembler back in the day (as well as more recently for some retro-computing fun), but never x86, so don't have any specific resources to share. Once you've learned the basics of the instruction set, a good point to start might be to take some simple functions and compile to assembler both with and without optimization enabled - and try coding the same function yourself in assembler to see if you can beat the compiler. Search for "x86 tricks" type of resources too - the things that other assembly programmers have learnt about making the best use of the instruction set to write fast and compact code.

Note that cache considerations apply to code as well as data, so you want your code to be compact (fit as much of your inner loops into cache as possible), and also to branch as little as possible, for two reasons. First, you want to take advantage of cache by executing consecutive instructions as far as possible, and second branching kills the pipelining performance (you are throwing away work already done) of modern processors, even though they try to mitigate this with branch prediction.
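
To illustrate the "branch as little as possible" point, here is a sketch with made-up function names. The first version has a data-dependent branch per element; the second uses the comparison result (0 or 1) as a number, so there is nothing for the branch predictor to get wrong.

```c
#include <stddef.h>

/* Branchy: one conditional per element, unless the compiler removes it. */
size_t count_above_branchy(const int *values, size_t count, int threshold)
{
    size_t hits = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] > threshold) {
            hits++;
        }
    }
    return hits;
}

/* Branch-free: the comparison yields 0 or 1, which is added directly. */
size_t count_above_branchless(const int *values, size_t count, int threshold)
{
    size_t hits = 0;
    for (size_t i = 0; i < count; i++) {
        hits += (size_t)(values[i] > threshold);
    }
    return hits;
}
```

Optimizing compilers often turn the first form into the second on their own, so check the disassembly before rewriting anything by hand.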


👤 rramadass
Not specific to LLVM/JIT, but for assembly check out the books by Larry Pyeatt (ARM) and Daniel Kusswurm (x86).

👤 fuzztester
For those who want to do x86 assembly first, google Paul Carter assembly language.

It could be one option.


👤 201984
uops.info is a very useful website when you're starting to optimize your code. It shows you the throughput and latency of most x86 instructions as tested on a large range of microarchitectures.

👤 jmspring
Under no circumstances should knowledge of Assembler be needed to work with LLMs.

Many people that work in DS/ML can barely make it with Python.