HACKER Q&A
📣 gautamsomani

Which DB internals should I study to become an expert in databases overall?


I started my career as a network technician/engineer (5 years), then moved to Linux administration (5 years), then DevOps/SRE (8 years). Note that I never went to college and never had any formal CS education; I learned everything by myself, and hence have serious gaps in my CS knowledge.

During my stints as a Linux admin and as an SRE, I came to realize that databases seem to be a good field to dive deep into, since no matter where I work and in whatever technical environment, the database will always influence the design and operation of the system (correct me if I am wrong here).

Initially I studied MySQL, then Hadoop a bit, then Kafka (not exactly a database system). I have also worked on Redis, Cassandra and Yugabyte. Studying these, I also realized that they often differ a lot, are each meant for a different purpose, and share very few internals with each other.

If I want to continue studying and building my knowledge in databases, what would be the right path for me? Right now I am reading the Codd paper, and I plan to finish the Database Internals book, followed by DDIA. What else should I study? Can you all please recommend some must-read papers for databases (both SQL and NoSQL), books, and database technologies to follow (via mailing lists etc.)? Would reading the complete official documentation of other databases be enough, or should I deep dive into all of them?

I really want to be an expert in databases, and have no limit on the time frame. I am just 36 now and have a lot of years of my life to devote to this study and become knowledgeable. I am looking for your expertise, knowledge and valuable experience to guide me.


  👤 epelesis Accepted Answer ✓
Here are some popular papers: https://github.com/rxin/db-readings

Once you finish the major papers, CMU has a ton of talks from more recent database research projects: https://db.cs.cmu.edu/seminar2020/

You could also try building a simple database from scratch, like this: https://www.awelm.com/posts/simple-db/

Also if you are specifically interested in SQL, make sure to learn relational algebra/calculus.
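
To give a feel for what relational algebra looks like in practice, here is a rough sketch of three of its operators (selection, projection, natural join) over plain Python lists of dicts. The function names and the sample `employees`/`departments` data are made up for illustration; real engines use far more efficient join algorithms than this nested loop.

```python
def select(relation, predicate):
    """Selection (sigma): keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection (pi): keep only the named attributes, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple((attr, row[attr]) for attr in attributes)
        if projected not in seen:
            seen.add(projected)
            result.append(dict(projected))
    return result


def natural_join(left, right):
    """Natural join: pair up rows that agree on every attribute they share."""
    result = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[attr] == r_row[attr] for attr in shared):
                result.append({**l_row, **r_row})
    return result


# Hypothetical sample relations, invented for this example.
employees = [
    {"name": "Ada", "dept_id": 1},
    {"name": "Edgar", "dept_id": 2},
    {"name": "Grace", "dept_id": 1},
]
departments = [
    {"dept_id": 1, "dept_name": "Storage"},
    {"dept_id": 2, "dept_name": "Query"},
]

# pi_name(sigma_{dept_name = 'Storage'}(employees JOIN departments))
storage_names = project(
    select(natural_join(employees, departments),
           lambda row: row["dept_name"] == "Storage"),
    ["name"],
)
```

The last expression corresponds roughly to `SELECT DISTINCT name FROM employees NATURAL JOIN departments WHERE dept_name = 'Storage'`, which is the kind of mapping that makes query plans easier to read later on.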


👤 sargstuff
?? DDIA ??

Joe Celko has published a number of books on various database topics (puzzles, advanced SQL examples, "standards", NoSQL, thinking in sets, etc.).

Based on the info provided:

The field of data science covers 'database systems concepts' and their use cases:

Use cases for the different types of database models (conceptual, logical, physical). [1]

Use cases for the underlying data structures, e.g. flat, key-value, wide column, document store, graph, etc.

Learn a programming language or framework (standalone, a database-specific 4GL, pandas, "web") to write database application(s).

-------------

[1] https://en.wikipedia.org/wiki/Database_model


👤 avinassh
Shameless plug: I am working on an educational project that teaches writing a key-value store from scratch. It is structured in TDD fashion with a CI/CD setup, so you keep making changes until the tests pass. Once you pass all the tests, you will have a functional disk-based key-value store.

https://github.com/avinassh/py-caskdb
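
For anyone wondering what a disk-based key-value store boils down to, here is a minimal sketch of the general idea: an append-only log file plus an in-memory index of where each value lives. This is not the project's actual code or file format; the `TinyKV` class, its method names, and the record layout (two 4-byte lengths followed by key and value bytes) are assumptions made up for illustration.

```python
import os
import struct

# Assumed record layout: 4-byte key length, 4-byte value length, key bytes, value bytes.
HEADER = struct.Struct("<II")


class TinyKV:
    """Hypothetical append-only key-value store with an in-memory offset index."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (file offset of the value bytes, value length)
        # "a+b" creates the file if it is missing and sends every write to the end.
        self.file = open(path, "a+b")
        self._rebuild_index()

    def _rebuild_index(self):
        # Replay the log from the start; later records override earlier ones.
        self.file.seek(0)
        offset = 0
        while True:
            header = self.file.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            key_len, value_len = HEADER.unpack(header)
            key = self.file.read(key_len).decode("utf-8")
            self.index[key] = (offset + HEADER.size + key_len, value_len)
            self.file.seek(value_len, os.SEEK_CUR)  # skip over the value bytes
            offset += HEADER.size + key_len + value_len

    def set(self, key, value):
        key_bytes = key.encode("utf-8")
        value_bytes = value.encode("utf-8")
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(HEADER.pack(len(key_bytes), len(value_bytes)))
        self.file.write(key_bytes)
        self.file.write(value_bytes)
        self.file.flush()
        self.index[key] = (offset + HEADER.size + len(key_bytes), len(value_bytes))

    def get(self, key, default=None):
        if key not in self.index:
            return default
        offset, length = self.index[key]
        self.file.seek(offset)
        return self.file.read(length).decode("utf-8")

    def close(self):
        self.file.close()
```

Overwriting a key just appends a new record and repoints the index, so old values pile up in the file. A real store would add checksums, deletes (tombstones), and compaction on top of this, and those are exactly the kinds of details worth working through by hand.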


👤 init
The CMU Database Group's YouTube channel has very good material on this. I highly recommend the Intro to Database Systems videos: https://www.youtube.com/c/CMUDatabaseGroup/playlists

👤 dmfay
de Haan & Koppelaars' _Applied Mathematics for Database Professionals_ is well worth your time, especially if you haven't studied discrete math yet.

👤 pighive
Following.