I've been offered a position at Microsoft to do kernel development work. This would be a big transition for me, coming from a services background (backend only).
The job's main draw to me is doing low-level work. I did some in my very first job, but for the past 10+ years, due to a number of circumstances, I've been in the services world. I really liked being a C programmer, and I've kept an eye on things over the years and done some hobby projects (on x86 and some embedded stuff as well).
There's a lot about my current job that I treasure, despite the work itself not being interesting to me about 99% of the time. It's a remote job, the work-life balance is stellar, and I get 25 days of vacation a year (this is in the US), which allows me to spend a lot of quality time with my wife.
However, I'm considering leaving because I've been having significant motivation and performance[0] issues for the last two years. Through a lot of soul searching and even help from a therapist, I've identified that the source of my issues is the nature of the work itself. Building services is just something that doesn't give me a sense of accomplishment, and I'm not attracted to the stuff at all. Some issues I've identified are:
1. Infrastructure complexity, especially since moving to Kubernetes. I refuse to touch it at this point.
2. Debugging exclusively via metrics and logs, since I can't just attach a debugger to a running server.
3. Designing systems in general. Some people love the challenge of distributed transactions, eventual consistency and all that jazz, but it just rubs my brain the wrong way. I'm not interested at all in that problem space [1].
4. The insane amount of work required to stand up even the smallest microservice: infrastructure provisioning, certificates, security reviews, GDPR compliance, etc.
5. Anything I build will end up paging some poor soul at 3am some day when something is down or under heavy traffic.
So, what I'm wondering is: what are the things that would make me say "ugh" on the day-to-day as a kernel developer? Is there a chance I'll be happier, or would I just be trading one miserable set of problems for other equally miserable problems?
I tried asking that to every person who interviewed me, but I only got somewhat vague answers like "the build can take a long time depending on what you're doing", etc. Someone complained about windbg.
[0] Even though my reviews have been good, I know deep inside I'm not doing even 10% of the good work I could do before.
[1] Ironically, I've acquired a ton of knowledge about it and I'm one of the "go to" people within my org.
I would say James Mickens sums things up nicely in "The Night Watch[0]." For example, you mention debugging with logs and metrics -- this snippet came to mind:
“Yeah, that sounds bad. Have you checked the log files for
errors?” I said, “Indeed, I would do that if I hadn’t broken every
component that a logging system needs to log data. I have a
network file system, and I have broken the network, and I have
broken the file system, and my machines crash when I make
eye contact with them. I HAVE NO TOOLS BECAUSE I’VE
DESTROYED MY TOOLS WITH MY TOOLS. My only logging
option is to hire monks to transcribe the subjective experience
of watching my machines die as I weep tears of blood.”
Mind you, I absolutely _love_ working on low-level stuff, and I wouldn't trade the time I get to spend actually doing that for anything. That said, the complexity of modern operating systems, CPU architectures, interconnects, and peripherals creates opportunities for frustration and confusion that honor no bounds of reasonability or decency.
[0]: https://www.usenix.org/system/files/1311_05-08_mickens.pdf
Kernel code is amazing, especially the parts written by the early members like DaveC, MarkZ and others.
For me the biggest part was working with a group of extremely smart people who were very nearly the best programmers in the world.
I really miss that outside of Microsoft. I would imagine that you can get the same experience if you worked with the Linux kernel dev team or some of the other few places in the world like FAMGA where you can work.
My suggestion is to go for it. After leaving MS I found peace by working with the open source community and open source software.
Hope that helps, happy to discuss further.
> 2. Debugging exclusively via metrics and logs, since I can't just attach a debugger to a running server.
You often can't do this in kernel/OS development work either. Serial printf logging is often required. It can be a real "my tools to debug my tools are broken" slog.
> 3. Designing systems in general. Some people love the challenge of distributed transactions, eventual consistency and all that jazz, but it just rubs my brain the wrong way. I'm not interested at all in that problem space [1].
I'm not sure I understand this issue. I mean, kernels have subsystems for doing stuff; you might need to design one someday? But it won't be dull-as-dishwater web technology stacks, it'll be you writing data structures directly in C/C++ or Rust if MS goes there.
1. There will be a lot of infrastructure complexity in the kernel, just prepare yourself for that. Even worse bugs! You'll be fixing a lot of bugs, or looking at a lot of bugs, and most of these bugs are from other teams who are interacting with your component! Just order a copy of Windows Internals and get yourself familiar with how things work.
2. Old ass engineering systems. Just as the interviewers said, you will spend a lot of your time waiting for Windows builds.
Good stuff:
1. Work life balance is amazing actually. Most of the time there's very little pressure for you to get work done; as long as you're doing something, no one really bothers you.
2. Since you said you hate designing systems, good news, everything has been designed for you! Your job will mostly be implementing new features for a component.
3. Windbg is actually great and I will die on this hill. You might have to print some debug logs but you won't be looking at metrics because there are whole teams dedicated to doing that stuff. There are also tools that quickly spin up VMs for you to do some live kernel debugging.
Meh stuff:
1. The pay could be better. That might change soon though.
I think you are coming at this from the wrong perspective. Rather than thinking about how to avoid work you DON'T like, think about what you DO like and then decide if the new job would offer more or less of that.
Personally, I've found that every five years I end up sick of working on the same kind of problem and I have to go work on something completely different. Maybe that's where you're at.
Most importantly, MAKE SURE YOU KNOW WHAT THE JOB IS! Microsoft people don't try to be dishonest, but there can be misunderstandings between you and your future coworkers about your role, and if you take a job that turns out to be different from what you expected, you will be unhappy.
If you haven't already, you should ask the hiring manager more about what the team does. Try to get enough specifics that you might not know everything the manager refers to, but can easily Google what you don't know: "For example, in Windows 10 version 2004, we shipped the API and implementation for the Windows hypervisor feature that lets third-party VM host software like VirtualBox force their VM guests' virtualized RAM to be paged into the host machine's physical RAM all the time." (Not an actual feature, at least as far as I know.) At this level of detail, you'll be able to judge whether the work is really what you think it is.
Talk to your other interviewers to learn more about the work and the team, if they gave you their contact info or otherwise seemed inclined to hear from you. 3 out of 4 of them are likely going to be your peers, and the 4th is either the hiring manager, another mid-level to senior leader, or a team architect - all will be at least close enough to your team that they won't give you vague generalities.
The concern for me would be the fact that Windows is no longer a fully offline experience running on people's desks. A large part of the kernel team is working on features that only benefit Azure, and you may get some of the same services exposure there that you say you're so burnt out on.
Asking to discuss the role more with the team to get a better sense of what they're doing, how they're doing it, and who their customers are seems like the path forward IMO.
In general, go for it. The level of understanding you obtain by working on the kernel about how things -really- work will make you a better engineer even if it turns out not to be for you.
Things I would caution you about based on the downsides of your current job:
1. Kernels make k8s look simple. Not just simple, child's play.
2. Debugging experience in kernel-land can vary widely depending on what layer of the stack you are working on and the nature of the bugs. Highly concurrent pieces of any kernel are a nightmare to debug because such bugs are normally race conditions and timings are incredibly sensitive when you get this close to the metal.
3. You won't have to worry about this initially, as you will probably need a few years' experience at this level before you design new kernel subsystems, but I definitely wouldn't consider architecture at this level less complicated than distributed systems. This is because computers these days -are- distributed systems. NUMA essentially means you have all the same problems. You do have much more convenient tools for solving them though (at the cost of performance) like HW coherency, etc.
4. Ok yeah, you shouldn't need to worry about this one. You won't be standing up new build systems or anything.
5. Well.. this is the rough part of kernel bugs. When you fk something up you potentially fk over everyone, usually in a very subtle, hard to diagnose and even harder to work around way.
So yeah, go do it but don't do it because you think you will be getting away from those things because you aren't really. Do it because you think it will be enriching/fun/whatever you want more of in your life.
The other great part is the sheer amount of engineering that will go into every feature you will own. I say “engineering” because it’ll be different from any services work you have done.
The tools at your disposal as part of the Windows Core team are top notch.
Good luck! You will not regret this.
If you can read and enjoy these books, consider it:
https://docs.microsoft.com/en-us/sysinternals/resources/wind...
The ugh!: You will be a tiny cog in a big machine, spending very little time writing code, mostly banging your head over someone else's bugs. And M$FT still has a certain stink to it these days and that may cling to you.
The Windows org has a reputation for being dysfunctional, although I never personally observed this, and I think that even if it were true it would be in the higher-level parts of the stack.
Kernel dev was my original dream but I fell in love with Excel during my internship. I definitely think you should go for it.
It leads to code being Complicated, and a lot of Kobayashi Maru situations where there is no good solution to an engineering problem, just a bunch of bad compromises. Depending on your area, you might run into this all the time, or you might not.
You said you were looking for things that you might not like about kernel development, so here’s a few:
- Distributed systems. Everything is distributed these days with NUMA, cache coherency, PCI transactions, etc. You’ll have to know how to use the right kind of atomic memory ordering, debug lockless algorithms, and know how the OS scheduler works at a deep level. If this doesn’t excite you, the job might not be a good fit.
- Working with physical hardware. Although in most cases you can get away with a VM, you will almost certainly have to use an actual device for some of your work. The hardware you use might not be 100% functional or even correct- I’ve had to debug kernel issues that ultimately were hardware errata. At Apple it’s easy to get in contact with the silicon design teams, but I’m not sure about Microsoft and Intel/Qualcomm. Working with physical hardware also limits your ability to work remotely easily. Lugging around a bunch of laptops, phones, and tablets in a carry-on suitcase is no fun, especially when you have to take them all out at TSA checkpoints.
- Lack of user-visible impact. Kernel dev is vitally important but it doesn’t get much visibility, except when something goes wrong. New kernel features hardly make the headlines. It can be a little annoying to see your colleagues in app/web dev get recognized for the features they worked on at WWDC or Microsoft Build, while you’re toiling away working on kernel features that very few people care about.
- Debugging. I actually think debugging kernels is not too difficult if you have the right set of tools. At Apple we use lldb and hardware debuggers (JTAG/SWD), and have the ability to take a core dump of the kernel after a panic to analyze later. But since kernel dev is at the core of the operating system, you’ll have to learn to debug other parts of the stack too. For example, you might make a kernel change that breaks the file explorer but only when you visit a specific directory. So you’ll need to know how to debug both the user space process to know what’s going wrong, and the kernel to know why that is happening.
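To make the first bullet above a bit more concrete, here's a minimal sketch of the kind of lockless code and memory-ordering choices it refers to, written with standard C11 atomics. This is only an illustration under my own assumptions, not code from any real kernel: the `work_item` and `work_list` names are made up, and a real kernel would use its own primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical work item; the names here are illustrative only. */
struct work_item {
    int value;
    struct work_item *next;
};

/* Head of a lock-free LIFO list shared by producers and a consumer. */
struct work_list {
    _Atomic(struct work_item *) head;
};

static void work_list_init(struct work_list *list)
{
    atomic_init(&list->head, NULL);
}

/* Any number of threads may push concurrently without taking a lock. */
static void work_list_push(struct work_list *list, struct work_item *item)
{
    assert(item != NULL);

    struct work_item *old =
        atomic_load_explicit(&list->head, memory_order_relaxed);
    do {
        item->next = old;
        /*
         * Release on success: the writes to item->value and item->next
         * become visible before the new head does. On failure the CAS
         * stores the current head into 'old' and we retry.
         */
    } while (!atomic_compare_exchange_weak_explicit(
        &list->head, &old, item,
        memory_order_release, memory_order_relaxed));
}

/*
 * Detach the whole list in one atomic exchange. Acquire pairs with the
 * release in work_list_push. Taking everything at once sidesteps the ABA
 * problem that a pop-one-node CAS loop would have to deal with.
 */
static struct work_item *work_list_take_all(struct work_list *list)
{
    return atomic_exchange_explicit(&list->head, NULL,
                                    memory_order_acquire);
}
```

Pushing item A and then item B and calling `work_list_take_all` hands back B with `next` pointing at A (last in, first out). The hard part in practice isn't writing these few lines, it's convincing yourself the orderings are right and then debugging it when they aren't.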
Although there are some negative aspects of the job, there are many more positives. I really like my job and can see myself staying at Apple for many years. I’m constantly learning and working with some of the smartest people I have ever met.
Give the Microsoft job a shot; you can always leave if it’s not a good fit but you will have left a better engineer than when you started.
In my experience the kernel devs are the best of the best developers. But know that working on the Windows kernel will be complicated and require a lot of domain knowledge. Expect to learn about it for _years_ to come.
> 1. Infrastructure complexity - Sure, but there would be a LOT of complexity in kernel dev
> 2. Debugging - :) MS kernel is ancient and must be full of cruft. Systems dev is notorious for all sorts of weird timing, coordination bugs.
> 3. Designing systems - You'd have transactions, concurrency, race condition kind of problems which imo tickle the same part of the brain as distributed txns, eventual consistency etc
That said, if you like it, you like it. Can't know without trying. If I were you, and if I had the opportunity - I'd go for a newer variant of this which could look like Apple's M1 team, Tesla's systems teams etc. simply to have a fresh slate to build on.
You might even be able to return to your old company if you hate it. Tell them directly "hey I need to give this a shot but I'd love to check-in in a year".
Curious how you get an offer such as this? I've thought about changing job roles, but I really suck at leet coding so I've never really bothered. I figured that as a 40 year old male, no one would hire me for a role unless I was already experienced. Is that not the case?
The main thing I'd consider in this is the work/life balance. Will it allow the same stellar level of that? People vary in their priorities. For some it's interesting/fun problems to solve, for some it's impact, but for me it's work/life balance, presuming of course that the work side isn't a hateful stressfest. No amount of remote work or flex schedule will make up for absolutely hellish work environment, and I had something like that once. (I ended up taking a 30% pay cut that I could just barely afford and got the heck out)
So I'm sorry that I can't speak to the kernel work itself, but instead the framework for making the decision. If the work/life balance will remain constant then you're only risking the possibility that the work won't be any better, and hopefully no worse. If the work/life balance for the new job is also uncertain then (for me) that would be a bigger risk consideration.
Finally, you should consider the worst case scenario: The work is worse, the work/life balance is also worse. How easily can you shift to something better? If you have the chops to work on the Windows kernel then I'd wager that if you hit the worst case scenario you could get out of it without too much trouble, but you're the only one who knows whether or not that's true.
Best of luck to you, sláinte, and may the wind be always at your back.
Congrats on your kernel development offer from Microsoft! I imagine these to be very scarce these days, considering that today's Microsoft seems to be more into services and less about low-level software development.
If you like to chat more about this, just drop me a line. Always interested to hear from low-level devs and aspiring ones. My website/e-mail is in my HN profile.
It’s very readable, if a bit idiosyncratic. I would work on it again, given the right opportunity.
if (offered_nt_kernel_job) accept_right_away()
You can always quit if it doesn't turn out well.
You'll work with people great at their job whom you can learn a lot from.
I also loved having my own window-side private office. I've since heard that Microsoft was trying to adopt open offices, but if it's still there, awesome. Remote work is okay too of course.
https://www.amazon.com/Old-New-Thing-Development-Throughout/...
Also read - Showstopper
I know, it gets a lot of trashing all around, the ads, the UI, the whatever - I like the UI and I really like what it offers me as a technical person: ETW traces, Performance Monitor, Event forwarding, PowerShell, Windows Terminal, GPO, Windows Defender with advanced features, Resource Monitor, ACLs (Well, auditing file access could have been better :) ), wf.msc, RSAT, Hyper-V and whatever features under the hood that enable Process Monitor!, ..., ... ,...
I’m not sure this is really a difference between kernel and backend service development. You can run the server on a dev machine, attach a debugger, and reproduce the bug. If your problem is you can’t reproduce the bug, you may encounter the same issue when working on the kernel. If someone reports a kernel bug, you may not be able to attach a debugger to their computer, so you will need to reproduce it.
- a lot of bikeshedding
- a mailing list driven workflow
- poor continuous integration
Raymond Chen is a favourite of mine:
You'll run into a lot of the same issues on the Windows kernel that you're getting irritated with in service development land. A lot of infrastructure is already out there, but figuring out how to use it effectively ends up being almost as much work as just rewriting it from scratch. Things are documented but poorly.
For Microsoft specifically you'll often get stuck because of problems with some other team's code, and then find yourself embroiled in multi-week battles with multiple engineers, managers, project managers, and product managers all at each other's throats.
All that said, the Windows kernel itself is a work of art IMO, and kernel development is a lot of fun. I think that if you want to make the big bucks but still work with C/C++, it's getting to the point where your choices are very limited: Microsoft, Google, Facebook, Apple, or Tableau.
Microsoft was terrible to work for, but they aren't as bad as a bad small company. At least there's HR to report gross violations to, they have a TON of great perks, the money is really good, they have good work-life balance, and it's not too hard to change teams (but if you find yourself wanting to jump to a different team, do it well before performance reviews come along, since your manager will definitely give you a bad review and hamstring your ability to move around within the company.. so you'll end up having to change companies. Not the worst thing if you do so, I guess)
EDIT: I just reread your list of complaints.. timing and consistency in the kernel are WAY more complicated than your average web services. There's also a ludicrous amount of red tape around getting anything done. I'm starting to think that you should look at Microsoft as a stepping stone in your career rather than its final destination.. it's still a good place to work for overall compared to random small companies, it'll give a boost to your resume, but it sounds like what you actually want to work on is game emulators or raspi home automation gizmos in your free time.
I think what you should do is, if possible, arrange a short call with some of the teammates you are going to work with. Just a casual one about what the kernel team's day-to-day work looks like there. If not with the team, then maybe with your manager.
Unless someone from that team or some ex-kernel team member replies here, all the rest of these posts are just speculative and not worth basing your decision on.