Some things in particular I'm trying to figure out:
* As with all dev work, a large portion of work is bug hunting, but what kind of bugs do DBs usually have to deal with?
* Do developers spend a lot of time on optimization or is this mostly just a concern that's figured out during initial development?
* What educational prereqs are there? Do employers (strongly) prefer a Masters or even PhD?
* How is the job market for this kind of work? Obviously demand is going to be much lower than for your standard webdev job, but how is the demand/supply imbalance?
* What employers hire these developers? Is it basically just FAANG and specialty companies a la Cockroach Labs?
Thanks to anyone who takes the time to respond!
Start with relational, understand why it is the reference architecture, and from there learn the tradeoffs involved and what other architectures bring to the table (columnar, streaming, object, in-memory, array, distributed, blockchain, NoSQL, etc.).
To really understand why you should start with relational, read Stonebraker's classic paper: "What goes around comes around": https://people.cs.umass.edu/~yanlei/courses/CS691LL-f06/pape...
It will teach you database evolution history so that you don't end up reinventing the wheel.
Stonebraker's MIT course: https://ocw.mit.edu/courses/electrical-engineering-and-compu...
There are a few lectures from this course on YouTube, not by him: https://youtube.com/playlist?list=PLfciLKR3SgqOxCy1TIXXyfTqK...
MIT's distributed systems course also touches on databases: https://pdos.csail.mit.edu/6.824/schedule.html
Course by one of his disciples: https://15721.courses.cs.cmu.edu/spring2020/
Yet another disciple (on edX too): https://m.youtube.com/playlist?list=PLYp4IGUhNFmw8USiYMJvCUj...
The red book: http://www.redbook.io/
For learning SQL really well, including relational algebra, I like this course: http://users.cms.caltech.edu/~donnie/cs121/
A few examples: scaling to more CPU cores, more RAM, bigger disks, and larger systems. Then there is the introduction of new hardware, such as Intel Persistent Memory and new GPUs that can be used for various purposes in a database, such as compression and encryption.
Every product has weaknesses that need to be addressed; what these are is obviously product dependent.
Personally, I have spent a very significant amount of time over the last 5 years automating algorithms so that they adapt to load, memory sizes, VM size and so forth.
A Master's is definitely OK, but my Ph.D. studies have certainly helped, since they made me do a deep dive into all the database algorithms. So a Master's is sufficient to be a database developer, but I would say a Ph.D. is a good idea if you aim for a database development architect role down the road. Best of luck in your new tasks.
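As a rough illustration of what "adapting to memory and VM size" can mean, here is a minimal sketch. It is not the code of any particular database; the function name, the 50% memory fraction and the 64 MiB per-core reserve are all assumptions picked for the example.

```python
# Hypothetical self-tuning rule: derive a buffer pool size from the
# machine's resources instead of asking the user to configure it.
# All names and constants here are illustrative assumptions.

MIB = 1024 * 1024


def pick_buffer_pool_bytes(total_ram_bytes, cpu_count,
                           fraction=0.5, per_core_reserve=64 * MIB):
    """Return a buffer pool size in bytes.

    Takes a fraction of total RAM, then subtracts a fixed reserve per
    CPU core for per-thread working memory. Never returns a negative
    value, so tiny VMs simply get a pool size of 0.
    """
    budget = int(total_ram_bytes * fraction)
    reserved = cpu_count * per_core_reserve
    return max(budget - reserved, 0)


if __name__ == "__main__":
    # Example: an 8 GiB VM with 4 cores.
    size = pick_buffer_pool_bytes(8 * 1024 * MIB, 4)
    print(size // MIB, "MiB")
```

A real system would presumably re-evaluate such a rule at runtime as load changes, rather than computing it once at startup.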
My last job was at a company that makes data storage systems [link redacted]. That thing probably doesn't look like a database to most people, but that is exactly what it is under the hood. We had quite a few ex-Oracle people on staff too, and their skills were very useful.
The bugs were pretty fun actually. We've had to deal with network card firmware corrupting frames, a CPU bug, PCIe issues, and of course the much more numerous (and mundane) kernel bugs and run-of-the-mill null pointers and memory leaks.
And before anyone says "just use Rust": the company started many years before Rust was a thing and there was simply too much to rewrite.
There is almost always room for a good generalist developer in a company like that. You don't have to be a domain expert to join. But of course there will also be some people with PhDs on staff. Learning from them is another draw.
I'm curious to hear some general answers
Data needs vary, and so do the skills and knowledge required. Best to figure out which field of data interests you and learn the environments that are used in it.
https://jira.mongodb.org/projects/SERVER/issues/SERVER-52300...
A curious detail is that Jira itself doesn't officially support MariaDB.