It is good to know that your base64 encoding function is tested for all corner cases, but integration and behaviour tests for the external interface/API are more important than exercising an internal implementation detail.
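To make that concrete, here's a toy sketch (not the kernel's actual base64 code) of what exhaustive corner-case testing of such an internal helper looks like - padding behaviour for zero, one, two and three input bytes, checked against the RFC 4648 test vectors:

```c
/* A minimal sketch: a toy base64 encoder plus the kind of corner-case
 * unit test the comment is talking about. Not kernel code. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char b64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode len bytes from src into dst (NUL-terminated), return output length. */
static size_t base64_encode(const unsigned char *src, size_t len, char *dst)
{
    size_t i, o = 0;

    for (i = 0; i + 2 < len; i += 3) {
        unsigned v = src[i] << 16 | src[i + 1] << 8 | src[i + 2];
        dst[o++] = b64[v >> 18 & 63];
        dst[o++] = b64[v >> 12 & 63];
        dst[o++] = b64[v >> 6 & 63];
        dst[o++] = b64[v & 63];
    }
    if (len - i == 1) {                 /* one trailing byte: "==" padding */
        unsigned v = src[i] << 16;
        dst[o++] = b64[v >> 18 & 63];
        dst[o++] = b64[v >> 12 & 63];
        dst[o++] = '=';
        dst[o++] = '=';
    } else if (len - i == 2) {          /* two trailing bytes: "=" padding */
        unsigned v = src[i] << 16 | src[i + 1] << 8;
        dst[o++] = b64[v >> 18 & 63];
        dst[o++] = b64[v >> 12 & 63];
        dst[o++] = b64[v >> 6 & 63];
        dst[o++] = '=';
    }
    dst[o] = '\0';
    return o;
}

int main(void)
{
    char out[32];

    /* the classic padding corner cases (RFC 4648 test vectors) */
    assert(base64_encode((const unsigned char *)"", 0, out) == 0);
    base64_encode((const unsigned char *)"f", 1, out);
    assert(strcmp(out, "Zg==") == 0);
    base64_encode((const unsigned char *)"fo", 2, out);
    assert(strcmp(out, "Zm8=") == 0);
    base64_encode((const unsigned char *)"foo", 3, out);
    assert(strcmp(out, "Zm9v") == 0);

    puts("base64 corner cases pass");
    return 0;
}
```

All of which says nothing about whether the feature built on top of it actually works for users.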
What is the external interface of the kernel? What is its surface area? A kernel is so central and massive that the only way to test its complete contract with the user (space) is... just to run stuff and see if it breaks.
TDD has some good ideas, but for a while it turned into a religion. While tests are great to have, a good and underrated integration testing system is simply having someone run your software. If no one complains, either no one is using it, or the software is doing its job. Do you really need tests for the read(2) syscall when Linux is running on a billion devices, and that syscall is called some 10^12 times per second globally?
Another huge source of issues is hardware doing something unexpected or not-to-spec - another thing that unit tests verify very poorly, given that any unit test will simply reproduce how the developer thinks some piece of hardware works, rather than what it does in real life.
Production kernels are usually better served by long-running stress tests that try to reproduce real-world use cases but inject randomness into what they are doing. And indeed, both the NT and Linux kernels are extensively tested in this fashion.
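To give a feel for the shape of such a test, here's a toy userspace sketch - not a real harness like LTP, just an illustration - that hammers a file with randomly sized, randomly placed reads and writes, logging the PRNG seed so any failure can be replayed:

```c
/* Toy randomized stress loop: random offsets, lengths, and operations,
 * with the seed printed up front so a failing run can be reproduced. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    unsigned seed = argc > 1 ? (unsigned)atoi(argv[1]) : (unsigned)time(NULL);
    char buf[4096];
    int fd;

    printf("seed=%u\n", seed);          /* replay by passing the same seed */
    srand(seed);

    fd = open("stress.dat", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    for (long i = 0; i < 1000000; i++) {
        off_t  off = rand() % (1 << 20);            /* random offset < 1 MiB */
        size_t len = 1 + rand() % sizeof(buf);      /* random length */

        memset(buf, rand() & 0xff, len);
        if (rand() & 1) {
            if (pwrite(fd, buf, len, off) < 0) { perror("pwrite"); return 1; }
        } else {
            if (pread(fd, buf, len, off) < 0) { perror("pread"); return 1; }
        }
        if ((i & 0xffff) == 0 && fsync(fd) < 0) { perror("fsync"); return 1; }
    }
    close(fd);
    return 0;
}
```

Real stress suites are far more elaborate, but the core trick is the same: randomness for coverage, a recorded seed for reproducibility.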
Here is an article about the various ways Linux is tested: https://embeddedbits.org/how-is-the-linux-kernel-tested/ Unit tests are just a small part of the array of automated testing used.
It gets harder to test low-level driver code, but some subsystems have similar sets of tests, e.g. MTD has a set of tests for drivers and the flash chips they interact with. In 2016 I ported those tests to userspace as part of mtd-utils. The tests for the kernel crypto stuff I mentioned would also test hardware accelerators, if enabled.
Filesystems have the fstests project (formerly xfstests, as it was originally written when porting XFS to Linux), testing filesystem semantics and known bugs. There is something similar for the block layer (blktests). The downside is that those test suites take rather long to do a full run.
Those are just things that I can think of off the top of my head, given the subsystems that I have interacted with in the past. Static analysis is also being done on the kernel code[1][2], there are CI test farms[3][4], fuzzing farms[5], etc. As others have pointed out, there is a unit testing framework in the kernel[6], IIRC replacing an older framework and ad-hoc test code (a minimal sketch of what such a test looks like follows the links below).
[1] https://www.kernel.org/doc/html/v4.15/dev-tools/coccinelle.h...
[2] https://scan.coverity.com/projects/linux
[4] https://bottest.wiki.kernel.org/
This is especially true when we're talking about reverse-engineered device communication protocols, where you don't have a clue how the device actually works internally, and thus lack the basic means of constructing a mock that does more than implement the exact same assumptions you've already based your driver code on (resulting in always-green tests that never find actual bugs). But even in cases where you do have a protocol spec, the vast majority of ugly bugs in drivers originate from differences between that spec and the behavior of the device in the real world.
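A toy illustration of that always-green trap (all names here are hypothetical): the mock below is written from the very assumption the driver is built on, so the test can only ever confirm that assumption, never contradict it.

```c
/* The fake device encodes exactly what the developer believes about the
 * hardware, so driver and mock can never disagree. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REG_STATUS   0x00
#define STATUS_READY 0x01   /* assumption from a reverse-engineered protocol */

/* --- mock hardware: mirrors the developer's belief --- */
static uint32_t mock_read_reg(uint32_t reg)
{
    if (reg == REG_STATUS)
        return STATUS_READY;   /* real silicon might set this bit late,
                                  report readiness elsewhere, etc. */
    return 0;
}

/* --- driver code under test, built on the same belief --- */
static bool device_is_ready(uint32_t (*read_reg)(uint32_t))
{
    return read_reg(REG_STATUS) & STATUS_READY;
}

int main(void)
{
    /* Green forever, regardless of what the physical device actually does. */
    assert(device_is_ready(mock_read_reg));
    return 0;
}
```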
Fontenelle, the golden tooth, etc.
Here's one prominent example: https://github.com/linux-test-project/ltp
A while back on Hacker News, there was an article about a company developing a database. Their entire testing methodology came down to producing a deterministic kernel and thread scheduler. This allowed them to simulate every possible permutation of their concurrent code in a reproducible manner, in the scientific sense of the word.
Developing this test framework was actually the majority of what the company did. Testing the kernel would be a similar level of effort.
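For a sense of what that looks like, here's a toy sketch of the idea (not that company's actual framework): two simulated tasks do a non-atomic read-modify-write on a shared counter, and the harness enumerates every interleaving of their steps, so each failing schedule is exactly reproducible.

```c
/* Deterministic interleaving exploration: enumerate every order in which
 * two tasks' steps can be scheduled and report the lost-update schedules. */
#include <stdio.h>

struct task {
    int step;   /* 0 = about to read, 1 = about to write, 2 = done */
    int local;  /* value read from the shared counter */
};

static int shared;

/* Advance one task by a single step: read, then write back local + 1. */
static void step(struct task *t)
{
    if (t->step == 0)
        t->local = shared;
    else if (t->step == 1)
        shared = t->local + 1;
    t->step++;
}

/* Recursively explore every choice of "which task runs next". */
static void explore(struct task a, struct task b, int shared_now,
                    char *trace, int depth)
{
    if (a.step == 2 && b.step == 2) {
        trace[depth] = '\0';
        printf("schedule %s -> counter=%d%s\n", trace, shared_now,
               shared_now != 2 ? "  (lost update!)" : "");
        return;
    }
    if (a.step < 2) {
        struct task na = a, nb = b;
        shared = shared_now;
        step(&na);
        trace[depth] = 'A';
        explore(na, nb, shared, trace, depth + 1);
    }
    if (b.step < 2) {
        struct task na = a, nb = b;
        shared = shared_now;
        step(&nb);
        trace[depth] = 'B';
        explore(na, nb, shared, trace, depth + 1);
    }
}

int main(void)
{
    struct task a = {0}, b = {0};
    char trace[8];

    explore(a, b, 0, trace, 0);
    return 0;
}
```

Running it prints all six schedules; the ones where both tasks read before either writes lose an update.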
- https://en.wikipedia.org/wiki/Linux_Test_Project
- https://github.com/linux-test-project/ltp
Others have already commented on the testing situation for the kernel in general (historically, the tests were all in separate projects outside the kernel, and more recently, the kernel itself has a testing framework), but for filesystems in particular, there's an external project called "xfstests". The name might imply they're only for xfs (and they were originally made to test xfs in particular), but they're used for all filesystems (and also for the common filesystem-related code in the kernel).
x86 in particular has quite a few, although they run from userspace and exercise stable ABIs, so one might argue that they’re really integration tests.
In software, real trouble comes from integration, not so much from the unit-level stuff (unless you practice monkey coding).
And Linux is not just a plain logic application but a kernel that runs on different hardware and has a huge base of users and applications. Testing the thing as a whole is the only thing that matters. Releasing alpha and beta software is a much more sound, rational and efficient approach. The IT industry is (or at least was) organized to test before going to production.
I am certainly not against unit testing. It remains a wise approach for pieces of software that need to be set in stone forever or have tightly bounded inputs and outputs (e.g. bank transfers). But it is nothing more than a tool that you might use or not depending on the situation. Certainly never a one-size-fits-all solution!
Is this an evolution of English or simply a sign that the writer is not a native English speaker?
(I mean no judgement with this question)
So the only approach that has worked is solid code reviews. There was a very interesting email exchange between Linus and an engineer from Sun. Linus was rejecting his patch because there was a spelling mistake in a comment. That's how strict he is about code quality.
Is there a reason something like the scheduler, which should be mostly algorithmic, should lack unit tests? I can see an argument against some aspects of device driver testing, but the scheduler is in a different class of code, right?
It's like a quarantine when you know a part of the codebase is infected by "bugs".
For the most part we had Linus eyeballing every line of code before merging it. And if you wrote an extra if or did a boo boo using the wrong enum flag or you overran your buffer, you were flogged, berated and pitied in front of an international audience of engineers.
I don't know how things go now in the post-sensitivity world.
Linux Kernel coders are real programmers. Do you think Mel wrote unit tests?[1]
Real programmers are discrete mathematicians in their head. With pointers.
Their code isn't perfect.
By analogy, when Andrew Wiles published his roughly 109-page proof of Fermat's Last Theorem, it had bugs. [2]
The mathematics community tested it and he eventually corrected it. The Linux Kernel is like that.
There are no unit tests for a^n + b^n = c^n because no positive integers satisfy it for any n above 2.
You can't unit test your way to secure, correct code in the kernel either. Only a community of testers and verifiers can do that.
"given enough eyeballs, all bugs are shallow"[3]
[1] http://www.catb.org/jargon/html/story-of-mel.html
[2] https://en.wikipedia.org/wiki/Wiles%27s_proof_of_Fermat%27s_...