HACKER Q&A
📣 warrenm

What is it about compiling that is so time-consuming?


20 years ago, the "cool kids" (aka the egocentric dorks who wanted to be thought of as "special") would brag about installing Gentoo from source, or something similar, and how the kernel had only taken 3 days to compile now that they had a gig of RAM instead of 384 megs

Even today, a not-insignificant number of projects/tools/etc are distributed as source, and you have to build them locally ... sometimes taking hours

Other tools (eg Apache) are distributed both as binaries and as source, and you can run the "out of the box" edition, or 'tweak' it a little by compiling it yourself...but again - it takes forever to get up and running (and heaven help you when an update comes out)

What is it about compilation that is so time-consuming?

Fundamentally, it's a [relatively-simple] translation from one language to another - with most compiled languages having been around for a long time

Surely compiling locally shouldn't be so astronomically time-consuming...should it?


  👤 bjourne Accepted Answer ✓
It's time consuming because the compiler tries to optimize the code. A compiler that doesn't try to optimize can be amazingly fast, like Python's bytecode compiler or Go's compiler in its early days. Neither attempted many optimizations.

Compiler optimizations are time consuming because they are trying to find solutions to NP-hard problems. Since solving NP-hard problems optimally is intractable, compilers use heuristics, so the more optimization the user asks for, the more time-consuming heuristics are executed. One such problem is register allocation: your CPU has a small, fixed number of registers that have to serve an almost unlimited number of variables, and figuring out a good register allocation is very difficult.
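As a rough sketch of what that heuristic looks like (a toy greedy graph coloring over a made-up interference graph, not any real compiler's allocator): two variables that are live at the same time can't share a register, and when the registers run out, something has to be spilled to memory.

    // Toy greedy graph-coloring register allocation (hypothetical variables
    // and interference graph, purely for illustration). Requires C++17.
    #include <iostream>
    #include <map>
    #include <set>
    #include <string>

    int main() {
        // An edge u--v means u and v are live at the same time, so they
        // cannot share a register.
        std::map<std::string, std::set<std::string>> interferes = {
            {"a", {"b", "c"}},
            {"b", {"a", "c", "d"}},
            {"c", {"a", "b"}},
            {"d", {"b"}},
        };
        const int kRegisters = 2;  // pretend the machine has two registers
        std::map<std::string, int> assigned;

        // Greedy heuristic: give each variable the lowest register not used by
        // an already-colored neighbour. Optimal coloring is NP-hard, so real
        // compilers settle for heuristics like this, plus spilling.
        for (const auto& [var, neighbours] : interferes) {
            std::set<int> taken;
            for (const auto& n : neighbours) {
                auto it = assigned.find(n);
                if (it != assigned.end()) taken.insert(it->second);
            }
            int reg = 0;
            while (taken.count(reg)) ++reg;
            if (reg >= kRegisters) {
                std::cout << var << " -> spill to memory\n";
            } else {
                assigned[var] = reg;
                std::cout << var << " -> r" << reg << "\n";
            }
        }
    }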

It also gets more difficult because modern compilers are optimizing larger contexts. Rust, C++, and other languages with functional features rely heavily on function inlining. So a function you write that is five lines may internally be expanded by the compiler to hundreds of lines.
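A contrived C++ example of that expansion (the function and types are made up; how much code actually gets inlined depends on the compiler and standard library):

    #include <cstdint>
    #include <numeric>
    #include <vector>

    // One short generic function in the source...
    template <typename T>
    T sum(const std::vector<T>& xs) {
        return std::accumulate(xs.begin(), xs.end(), T{});
    }

    int main() {
        std::vector<std::int32_t> a{1, 2, 3};
        std::vector<double> b{1.0, 2.0, 3.0};
        // ...but the compiler generates a separate copy for each instantiation
        // and inlines std::accumulate's iterator machinery into both, so the
        // optimizer ends up chewing on far more code than the few lines above.
        return static_cast<int>(sum(a) + sum(b));
    }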


👤 mike_hearn
There are fast compilers. Go is an example. Delphi was another very fast compiler.

AOT compilers that are slow are slow because they spend a lot of time optimizing code that isn't particularly important (but they have no way to know that).

A good way to see this is by taking a large-ish Java program that does the same thing as a large-ish C++ or Rust program, and comparing the compile times. The Java program will compile much faster, but runtime performance will be roughly comparable post-warmup (unless the program is very sensitive to memory locality, as Java doesn't currently have value types). The reason is that the Java -> bytecode compilation is very simple and involves almost no optimization work, and then the rest of the compile is done at runtime, but it's all profile guided so only the parts of the app that actually benefit from optimization get it.
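The AOT analogue of that runtime profiling is profile-guided optimization. A rough sketch of the usual GCC/Clang workflow (the file name and workload are hypothetical, and exact flags can vary by toolchain):

    // hot.cpp -- trivial program used to sketch a profile-guided build.
    //
    //   g++ -O2 -fprofile-generate hot.cpp -o hot   # instrumented build
    //   ./hot                                       # run a representative workload
    //   g++ -O2 -fprofile-use hot.cpp -o hot        # rebuild; the optimizer now
    //                                               # knows which paths are hot
    #include <cstdio>

    int main() {
        long total = 0;
        for (int i = 0; i < 1000000; ++i) total += i % 7;  // the "hot" loop
        std::printf("%ld\n", total);
    }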


👤 madethissoicou
One issue in C/C++ is that each compilation unit (i.e. each .cpp file) includes tons of headers, which in turn include tons more headers, so you get something like O(n^2) growth, where n is the source size. Obviously you can do better if you're careful about dependencies.
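One way to see the blow-up is to preprocess a tiny file and count the lines the compiler actually has to parse (the exact count depends on your compiler and standard library):

    // tiny.cpp -- three includes and a one-line main.
    // Preprocess it and count what the compiler really has to read:
    //   g++ -E -P tiny.cpp | wc -l
    // With a typical standard library this expands to tens or hundreds of
    // thousands of lines, even though the file itself is only a few lines long.
    #include <iostream>
    #include <string>
    #include <vector>

    int main() { std::cout << "hello\n"; }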

👤 JCWasmx86
You have more complex programming languages like Rust/Swift that take ages to compile compared with C, e.g. due to all the extra code that is generated and then has to be optimized by LLVM (which is quite slow itself, AFAIK).

C++ has its template instantiations with the same problem.
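For example (a contrived compile-time Fibonacci, not representative of real code, but it shows the mechanism): every distinct N below is a separate instantiation the front end has to generate and type-check before the optimizer even sees it.

    // Each distinct N produces a separate class the compiler must instantiate,
    // so deep or wide template use multiplies the code the front end (and then
    // LLVM) has to process.
    template <int N>
    struct Fib {
        static const long value = Fib<N - 1>::value + Fib<N - 2>::value;
    };
    template <> struct Fib<1> { static const long value = 1; };
    template <> struct Fib<0> { static const long value = 0; };

    int main() { return Fib<20>::value % 100; }  // instantiates Fib<0> .. Fib<20>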

Another aspect is how much more compilers optimize compared to the "early" days: modern compilers can often beat human programmers at writing optimized assembly in a reasonable time, and they can do a lot more complex optimizations. In addition, you can couple that with linker-level optimizations like LTO or BOLT, and you have even longer build times.

I think another factor is the increasingly dominant static linking model (Rust/Go/Swift). If I compile my program with the Swift stdlib statically linked, an incremental build takes around 15 seconds, while with a dynamically linked stdlib it takes around 6 seconds. Sure, it is just one benchmark, but statically linking nearly everything is bound to have some impact.


👤 WorldMaker
A simple, fundamental reason that hasn't shifted in 20 years, and in some ways has only grown worse as hard drives have grown ever larger relative to RAM: file I/O. Compilers eat a lot of files and produce a number of additional files, sometimes with several passes of intermediate files along the way.

You could have the most efficient possible compiler and it still has to read N source files and write X binary objects back to disk. Those will always be bottlenecks, and they are sometimes extremely visible in wall clock time (even on the fastest SSDs, once you also account for OS paging and caching and background processing like anti-virus tools and other security audits).


👤 jvalencia
It's about efficiency. Obviously something like Python is effectively real time. Even TypeScript to JavaScript can be fairly quick (a la esbuild). However, once we start talking about moving bits around using compiler optimizations, things slow down a lot, because there are a lot of passes that walk over every bit of the generated code to reorganize it into a machine-optimized state. Some quick Google searches get you the gory details: https://gcc.gnu.org/onlinedocs/gccint/Tree-SSA.html#Tree-SSA

👤 robertelder
I spent a fair amount of time building my own from-scratch C compiler, which I no longer work on:

https://recc.robertelder.org/

One of the reasons I stopped working on it was how slow it became, so I might be able to help answer your question.

Initially, when the compiler was simpler, it was actually much faster. I was able to do some meaningful proof-of-concept demos with it, like compiling a small microkernel and compiling most of its own source code. Of course, the natural thing to do was to make it cross-compile itself and run in the browser, and that's where it became terribly slow, which required more code to optimize it, and the new code added to make it faster in the long term made it much slower in the short term.

To start with, if you think of a simple piece of code like this:

    if ( 1 ) { putc('a'); }
This is only 23 bytes of program text, so why should it be slow to compile? Well, the first stage of parsing this program involves tokenization. In this short program, I count 16 different 'tokens' (including the whitespace). If you want even the simplest data structure to describe one of your 'tokens', one that contains just a single pointer to an offset in the program, then you will need 16 pointers just for the tokens. On a 64-bit machine you'll have 8-byte pointers, and 16 * 8 = 128 bytes, just for the pointers into the byte array of the program! And we haven't even started talking about the memory overhead of all the other things you'll need to record about these tokens in your token object.
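To make that concrete, here is roughly what such a token representation looks like in C++ (the field names are illustrative, not recc's actual layout):

    #include <cstdio>

    // Minimal token: just a pointer to where it starts in the source buffer.
    struct MinimalToken {
        const char* start;
    };

    // A slightly more realistic token also needs a length and a kind, and
    // padding typically rounds the size up further.
    struct Token {
        const char* start;
        unsigned    length;
        int         kind;   // identifier, punctuation, literal, ...
    };

    int main() {
        // On a typical 64-bit machine this prints "8 16": already well past the
        // 23 bytes of source once you have 16 of them.
        std::printf("%zu %zu\n", sizeof(MinimalToken), sizeof(Token));
    }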

So, now we already have a memory overhead that is more than 5 times as big as the program, but we also have to build the parse tree, control flow graphs, linker objects etc. and you also have to pull in a mess of header files, bloated libraries etc. If you're wasteful with memory in the compiler, you can easily run out of memory from compiling a few megabytes of source code. Being more intelligent with memory management requires copying memory around a lot, which also adds to the latency.

So, now you need to think about optimizing your memory use, and doing 'smarter' things that trade memory usage for CPU. Plus, you're likely to start freeing/deleting heap memory a lot, which can bottom out in system calls to the allocator and is therefore slower than an ordinary call within your program. By the time you implement all this 'optimization', your compiler has become an incredibly complicated and bloated system that requires even more code to exploit all the remaining opportunities for improvement.
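One common way compilers sidestep that free/delete churn is an arena (bump) allocator: carve nodes out of big blocks and release everything at once when the compilation unit is done. A minimal sketch of the idea (not code from recc):

    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    // Minimal bump/arena allocator: individual nodes are never freed; the whole
    // arena is released in one pass, avoiding per-node free() traffic.
    class Arena {
        static const std::size_t kBlockSize = 1 << 20;  // 1 MiB blocks
        std::vector<char*> blocks_;
        std::size_t used_ = kBlockSize;                 // force a block on first use
    public:
        void* allocate(std::size_t n) {
            n = (n + 15) & ~std::size_t(15);            // keep 16-byte alignment
            if (used_ + n > kBlockSize) {
                blocks_.push_back(static_cast<char*>(std::malloc(kBlockSize)));
                used_ = 0;
            }
            void* p = blocks_.back() + used_;
            used_ += n;
            return p;
        }
        ~Arena() {
            for (char* b : blocks_) std::free(b);
        }
    };

    struct AstNode { int kind; AstNode* left; AstNode* right; };

    int main() {
        Arena arena;
        AstNode* root = static_cast<AstNode*>(arena.allocate(sizeof(AstNode)));
        root->kind = 0; root->left = root->right = nullptr;
        return root->kind;  // thousands of nodes later: still zero individual frees
    }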


👤 dangets
A major contributor can be memory access patterns. Traversing trees and pointers not adjacent in memory can result in a noticeable performance drag. Andrew Kelley gave an interesting talk about performance wins in the Zig compiler here https://media.handmade-seattle.com/practical-data-oriented-d... , but the same idea is broadly applicable.
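The gist of the data-oriented approach, very roughly (the Zig compiler's real structures are more elaborate): store nodes in flat arrays and refer to them by index instead of by pointer, so traversals stream through contiguous memory.

    #include <cstdint>
    #include <vector>

    // Pointer-chasing layout: nodes are scattered across the heap, so every
    // child access risks a cache miss.
    struct PtrNode {
        int tag;
        PtrNode* lhs;
        PtrNode* rhs;
    };

    // Data-oriented layout: all nodes live in one contiguous vector and refer
    // to each other by 4-byte indices instead of 8-byte pointers.
    struct FlatNode {
        int tag;
        std::uint32_t lhs;  // index into Ast::nodes, 0 meaning "none"
        std::uint32_t rhs;
    };

    struct Ast {
        std::vector<FlatNode> nodes;  // node 0 is reserved as the null node

        std::uint32_t add(int tag, std::uint32_t lhs, std::uint32_t rhs) {
            nodes.push_back({tag, lhs, rhs});
            return static_cast<std::uint32_t>(nodes.size() - 1);
        }
    };

    int main() {
        Ast ast;
        ast.add(0, 0, 0);                       // null node
        std::uint32_t a = ast.add(1, 0, 0);     // leaf
        std::uint32_t b = ast.add(1, 0, 0);     // leaf
        std::uint32_t plus = ast.add(2, a, b);  // "a + b"
        return static_cast<int>(ast.nodes[plus].tag);
    }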

👤 malux85
Most of the time you don't need to compile things yourself - which is why OS package managers exist.

But sometimes it's advantageous. E.g. it's possible for a compiler to take advantage of special CPU instructions that are present on your CPU but can't be guaranteed to be present everywhere - resulting in a faster binary. In some special but relatively rare edge cases (e.g. numerical computing / simulations) these speed-ups can be significant.
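A small example of the kind of code that benefits (the flags are the usual GCC/Clang ones; whether vectorization actually kicks in depends on your CPU and compiler):

    // saxpy.cpp -- the classic y = a*x + y loop.
    // Generic build:        g++ -O2 saxpy.cpp -o saxpy
    // Tuned to this CPU:    g++ -O2 -march=native saxpy.cpp -o saxpy
    // With -march=native the compiler may use SIMD instructions it knows this
    // machine has, which can noticeably speed up numeric loops like this one,
    // at the cost of a binary that may not run on other CPUs.
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    int main() {
        const std::size_t n = 1 << 20;
        std::vector<float> x(n, 1.0f), y(n, 2.0f);
        const float a = 3.0f;
        for (std::size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
        std::printf("%f\n", static_cast<double>(y[0]));  // 5.0
    }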

But there's heaps of room for improvement. E.g. most compilations have steps that are embarrassingly parallel, so make has an option "-j" that lets you use multiple CPU cores; this significantly speeds up the compilation of numpy. The trouble is, it also increases memory consumption quite a bit, so make can't change its default to use all cores, as that would OOM a lot of existing build systems.

From what I have seen, the vast majority (like >90%) of automated build systems don't take advantage of multi-core compiles, so that's why they are dreadfully slow by default. Python's subsystems are guilty of this, which is why building matplotlib, numpy, scipy, etc. from source is very, very slow.


👤 syntheweave
Compilation is, at its core, an automation for quality. That is, it's doing work that is intended to polish and refine its source input to some end.

If you don't need the compiler, you can write machine code in hexadecimal directly, like the old days - that's really "coding" a program. But as a practical matter, we do need it - we want more organized methods of coding with more static checks. And then it's a question of, well, how much? Can we automate even more of it?

And we don't know those answers but we keep pushing the boundary a little farther as hardware progresses, and we also have a gradual creeping up of source code size as more and more features work their way down the stack towards being shared, reused code, so compilers always find ways to be slower. Work in an older compiler on newer hardware and it'll feel fast.


👤 JonChesterfield
The source and target languages are usually quite different from one another, the user expects some extensive semantic checking with error messages, and then the majority of the time should be spent on code optimisation so that the resulting program runs faster.

Some languages (I suppose language pairs from source to target) compile dramatically faster than others.


👤 tabtab
At our org, compiles are about 2x slower because the anti-virus software must scan every changed file before it can be referenced (read). It's a domain that needs good security, but the "scan tax" is high when you factor in the cost of human time spent waiting. It's probably overkill, but it's beyond my power to change.

👤 boppo1
On this topic: I'm building a computer for the first time in many years, mostly for writing C & C++ code. But do I optimize for RAM or CPU? Is multithreading helpful in compilation?