It is good to know that your base64 encoding function is tested for all corner cases, but integration and behaviour tests for the external interface/API are more important than exercising an internal implementation detail.
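To make that concrete, here's a toy sketch (not the kernel's actual base64 code) of what exhaustive corner-case testing of such an internal helper looks like - padding behaviour for zero, one, two and three input bytes, checked against the RFC 4648 test vectors:

```c
/* A minimal sketch: a toy base64 encoder plus the kind of corner-case
 * unit test the comment is talking about. Not kernel code. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char b64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode len bytes from src into dst (NUL-terminated), return output length. */
static size_t base64_encode(const unsigned char *src, size_t len, char *dst)
{
    size_t i, o = 0;

    for (i = 0; i + 2 < len; i += 3) {
        unsigned v = src[i] << 16 | src[i + 1] << 8 | src[i + 2];
        dst[o++] = b64[v >> 18 & 63];
        dst[o++] = b64[v >> 12 & 63];
        dst[o++] = b64[v >> 6 & 63];
        dst[o++] = b64[v & 63];
    }
    if (len - i == 1) {                 /* one trailing byte: "==" padding */
        unsigned v = src[i] << 16;
        dst[o++] = b64[v >> 18 & 63];
        dst[o++] = b64[v >> 12 & 63];
        dst[o++] = '=';
        dst[o++] = '=';
    } else if (len - i == 2) {          /* two trailing bytes: "=" padding */
        unsigned v = src[i] << 16 | src[i + 1] << 8;
        dst[o++] = b64[v >> 18 & 63];
        dst[o++] = b64[v >> 12 & 63];
        dst[o++] = b64[v >> 6 & 63];
        dst[o++] = '=';
    }
    dst[o] = '\0';
    return o;
}

int main(void)
{
    char out[32];

    /* the classic padding corner cases (RFC 4648 test vectors) */
    assert(base64_encode((const unsigned char *)"", 0, out) == 0);
    base64_encode((const unsigned char *)"f", 1, out);
    assert(strcmp(out, "Zg==") == 0);
    base64_encode((const unsigned char *)"fo", 2, out);
    assert(strcmp(out, "Zm8=") == 0);
    base64_encode((const unsigned char *)"foo", 3, out);
    assert(strcmp(out, "Zm9v") == 0);

    puts("base64 corner cases pass");
    return 0;
}
```

All of which says nothing about whether the feature built on top of it actually works for users.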
What is the external interface of the kernel? What is its surface area? A kernel is so central and massive that the only way to test its complete contract with the user (space) is... just to run stuff and see if it breaks.
TDD has some good ideas, but for a while it turned into a religion. While tests are great to have, a good and underrated integration testing system is simply having someone run your software. If no one complains, either no one is using it, or the software is doing its job. Do you really need tests for the read(2) syscall when Linux is running on a billion devices, and that syscall is called some 10^12 times per second globally?
Another huge source of issues is hardware doing something unexpected or not-to-spec - another thing that unit tests verify very poorly, given that any unit test will simply reproduce how the developer thinks some piece of hardware works, rather than what it does in real life.
Production kernels are usually better served by long-running stress tests that try to reproduce real-world use cases but inject randomness into what they are doing. And indeed, both the NT and Linux kernels are extensively tested in this fashion.
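To give a feel for the shape of such a test, here's a toy userspace sketch - not a real harness like LTP, just an illustration - that hammers a file with randomly sized, randomly placed reads and writes, logging the PRNG seed so any failure can be replayed:

```c
/* Toy randomized stress loop: random offsets, lengths, and operations,
 * with the seed printed up front so a failing run can be reproduced. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    unsigned seed = argc > 1 ? (unsigned)atoi(argv[1]) : (unsigned)time(NULL);
    char buf[4096];
    int fd;

    printf("seed=%u\n", seed);          /* replay by passing the same seed */
    srand(seed);

    fd = open("stress.dat", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    for (long i = 0; i < 1000000; i++) {
        off_t  off = rand() % (1 << 20);            /* random offset < 1 MiB */
        size_t len = 1 + rand() % sizeof(buf);      /* random length */

        memset(buf, rand() & 0xff, len);
        if (rand() & 1) {
            if (pwrite(fd, buf, len, off) < 0) { perror("pwrite"); return 1; }
        } else {
            if (pread(fd, buf, len, off) < 0) { perror("pread"); return 1; }
        }
        if ((i & 0xffff) == 0 && fsync(fd) < 0) { perror("fsync"); return 1; }
    }
    close(fd);
    return 0;
}
```

Real stress suites are far more elaborate, but the core trick is the same: randomness for coverage, a recorded seed for reproducibility.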
Here is an article about the various ways Linux is tested: https://embeddedbits.org/how-is-the-linux-kernel-tested/ Unit tests are just a small part of the array of automated testing used.
It gets harder to test low-level driver code, but some subsystems have similar sets of tests, e.g. MTD has a set of tests for drivers and the flash chips they interact with. In 2016 I ported those tests to userspace as part of mtd-utils. The tests for the kernel crypto stuff I mentioned would also test hardware accelerators, if enabled.
Filesystems have the fstests project (formerly xfstests, as it was originally written when porting XFS to Linux), testing filesystem semantics and known bugs. There is something similar for the block layer (blktests). The downside is that those test suites take rather long to do a full run.
Those are just things that I can think of off the top of my head, given the subsystems that I have interacted with in the past. Static analysis is also being done on the kernel code[1][2], there are CI test farms[3][4], fuzzing farms[5], etc. As others have pointed out, there is a unit testing framework in the kernel[6], IIRC replacing an older framework and ad-hoc test code (a minimal sketch of what such a test looks like follows the links below).
[1] https://www.kernel.org/doc/html/v4.15/dev-tools/coccinelle.h...
[2] https://scan.coverity.com/projects/linux
[4] https://bottest.wiki.kernel.org/
This is especially true when we're talking about reverse-engineered device communication protocols, where you don't have a clue how the device actually works internally, and thus lack the basic means of constructing a mock that does more than implement the exact same assumptions you've already based your driver code on (resulting in always-green tests that never find actual bugs). But even in cases where you do have a protocol spec, the vast majority of ugly bugs in drivers originate from differences between that spec and the behavior of the device in the real world.
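A toy illustration of that always-green trap (all names here are hypothetical): the mock below is written from the very assumption the driver is built on, so the test can only ever confirm that assumption, never contradict it.

```c
/* The fake device encodes exactly what the developer believes about the
 * hardware, so driver and mock can never disagree. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REG_STATUS   0x00
#define STATUS_READY 0x01   /* assumption from a reverse-engineered protocol */

/* --- mock hardware: mirrors the developer's belief --- */
static uint32_t mock_read_reg(uint32_t reg)
{
    if (reg == REG_STATUS)
        return STATUS_READY;   /* real silicon might set this bit late,
                                  report readiness elsewhere, etc. */
    return 0;
}

/* --- driver code under test, built on the same belief --- */
static bool device_is_ready(uint32_t (*read_reg)(uint32_t))
{
    return read_reg(REG_STATUS) & STATUS_READY;
}

int main(void)
{
    /* Green forever, regardless of what the physical device actually does. */
    assert(device_is_ready(mock_read_reg));
    return 0;
}
```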
Fontenelle, the golden tooth, etc.
Here's one prominent example: https://github.com/linux-test-project/ltp
A while back on Hacker News, there was an article about a company developing a database. Their entire testing methodology came down to producing a deterministic kernel and thread scheduler. This allowed them to simulate every possible permutation of their concurrent code in a reproducible manner, in the scientific sense of the word.
Developing this test framework was actually the majority of what the company did. Testing the kernel would be a similar level of effort.
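For a sense of what that looks like, here's a toy sketch of the idea (not that company's actual framework): two simulated tasks do a non-atomic read-modify-write on a shared counter, and the harness enumerates every interleaving of their steps, so each failing schedule is exactly reproducible.

```c
/* Deterministic interleaving exploration: enumerate every order in which
 * two tasks' steps can be scheduled and report the lost-update schedules. */
#include <stdio.h>

struct task {
    int step;   /* 0 = about to read, 1 = about to write, 2 = done */
    int local;  /* value read from the shared counter */
};

static int shared;

/* Advance one task by a single step: read, then write back local + 1. */
static void step(struct task *t)
{
    if (t->step == 0)
        t->local = shared;
    else if (t->step == 1)
        shared = t->local + 1;
    t->step++;
}

/* Recursively explore every choice of "which task runs next". */
static void explore(struct task a, struct task b, int shared_now,
                    char *trace, int depth)
{
    if (a.step == 2 && b.step == 2) {
        trace[depth] = '\0';
        printf("schedule %s -> counter=%d%s\n", trace, shared_now,
               shared_now != 2 ? "  (lost update!)" : "");
        return;
    }
    if (a.step < 2) {
        struct task na = a, nb = b;
        shared = shared_now;
        step(&na);
        trace[depth] = 'A';
        explore(na, nb, shared, trace, depth + 1);
    }
    if (b.step < 2) {
        struct task na = a, nb = b;
        shared = shared_now;
        step(&nb);
        trace[depth] = 'B';
        explore(na, nb, shared, trace, depth + 1);
    }
}

int main(void)
{
    struct task a = {0}, b = {0};
    char trace[8];

    explore(a, b, 0, trace, 0);
    return 0;
}
```

Running it prints all six schedules; the ones where both tasks read before either writes lose an update.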
- https://en.wikipedia.org/wiki/Linux_Test_Project
- https://github.com/linux-test-project/ltp
Others have already commented on the testing situation for the kernel in general (historically, the tests were all in separate projects outside the kernel, and more recently, the kernel itself has a testing framework), but for filesystems in particular, there's an external project called "xfstests". The name might imply they're only for xfs (and they were originally made to test xfs in particular), but they're used for all filesystems (and also for the common filesystem-related code in the kernel).
x86 in particular has quite a few, although they run from userspace and exercise stable ABIs, so one might argue that they’re really integration tests.
In software, real trouble comes from integration, not so much from the unit-level stuff (unless you practice monkey coding).
And Linux is not just a plain logic application but a kernel that runs on different hardware and has a huge base of users and applications. Testing the thing as a whole is the only thing that matters. Releasing alpha and beta software is a much more sound, rational and efficient approach. The IT industry is (or at least was) organized to test before going to production.
I am certainly not against unit testing. It remains a wise approach for pieces of software that need to be set in stone forever or have tightly bounded inputs and outputs (e.g. bank transfers). But it is nothing more than a tool that you might use or not depending on the situation. Certainly never a one-size-fits-all solution!
Is this an evolution of English or simply a sign that the writer is not a native English speaker?
(I mean no judgement with this question)
So the only approach that has worked is solid code reviews. There was a very interesting email exchange between Linus and an engineer from Sun. Linus was rejecting his patch because there was a spelling mistake in a comment. That's how strict he is about code quality.
Is there a reason something like the scheduler, which should be mostly algorithmic, should lack unit tests? I can see an argument against some aspects of device driver testing, but the scheduler is in a different class of code, right?
It's like a quarantine when you know a part of the codebase is infected by "bugs".
For the most part we had Linus eyeballing every line of code before merging it. And if you wrote an extra if or did a boo boo using the wrong enum flag or you overran your buffer, you were flogged, berated and pitied in front of an international audience of engineers.
I don't know how things go now in the post-sensitivity world.
Linux Kernel coders are real programmers. Do you think Mel wrote unit tests?[1]
Real programmers are discrete mathematicians in their head. With pointers.
Their code isn't perfect.
By analogy, when Andrew Wiles published his roughly 109-page proof of Fermat's Last Theorem, it had bugs. [2]
The mathematics community tested it and he eventually corrected it. The Linux Kernel is like that.
There are no unit tests for a^n + b^n = c^n because no positive integers satisfy it for any n above 2.
You can't unit test your way to secure, correct code in the kernel either. Only a community of testers and verifiers can do that.
"given enough eyeballs, all bugs are shallow"[3]
[1] http://www.catb.org/jargon/html/story-of-mel.html
[2] https://en.wikipedia.org/wiki/Wiles%27s_proof_of_Fermat%27s_...