HACKER Q&A
📣 getcrunk

Is there a resource to help choose the right DB for you needs?


Is there something a junior developer can use that explains the various design/philosophical tradeoffs of various databases which among other things help avoid gothchas down the road (i.e. in regards to scaling/performance)? Something that is also practical as opposed to theoretical (talks about actual dbs and not abstract).

I already understand the basic difference between sql/nosql and the CAP theorem (undergrad). But now at the beginning stage of an app I’m working on I’ve spent the last few hours looking at the indexing, querying and scalability options of various databases and feel overwhelmed.

Some areas of interest off the top of my head:

  - Sql/nosql
  - Eventual vs immediate consistency
  - Rest or http
  - Orm options
  - Sharding/presharding and clustering 
  - Scalability/redundance
  - Indexing, secondary indexes
  - Querying
  - Realtime db (what even is that)
Then there is also the performance impact at larger scale based on the effects of the previously mentioned factors. Couchdb has multi-master replication, but from my reading writes need to happen to all replicas (an understandable tradeoff, but I don’t know is this the only way. Seem like it but I don’t want to have to decide that. I want someone relatively competent resource to lean on to know this other than my own synthesis.) Mongo’s eventual consistency has problems with potential data loss (seems like a normal result of eventual consistency). Mongo can have secondary indexes but other databases cannot. Couchbase allows scaling out a lot easier than couch db (no resharding). Postgres cant scale out easily but scales up well. For a mongo query you have to pull the whole document (that's why they recommend keeping it flat)

And I’m sure this just scratches the surface, but I hope there’s something you people can refer me to that’s short of an entire or multiple db courses.


  👤 robertbalent Accepted Answer ✓
As for a practical advice:

Just use PostgreSQL. You can get really really far with just an SQL database and few indexes. It's enough for 99% of projects.

This has two big advantages:

1. PostgreSQL is a reliable, high performant database, which has been proven over time. So you are not going to deal with weird issues and can focus on your project.

2. You will learn SQL which you are going to use throughout your whole career.

And if you decide to go this way, read this: https://use-the-index-luke.com/


👤 joshxyz
My cheat sheet is: For transactions, postgresql. For analytics, clickhouse. For cache, redis. For search, elasticsearch. For objects, anything s3 compatible.

Anything else, I really have to do in-depth research and comparisons because 1.) Each database has their own pros and cons, 2.) These pros and cons between dbs tend to overlap and has each of their own gotchas, and 3.) These pros and cons also improve overtime where some problems brought up earlier might be addressed already in the lastest releases.

CMU Database Group's YT channel is also worth checking out for introductions on some databases.

https://m.youtube.com/channel/UCHnBsf2rH-K7pn09rb3qvkA


👤 xupybd
Orms are generally a lot of extra work down the track. They do 80% of things very well and save loads of time but you will spend an amazing amount of time on the other 20%.

Learning SQL is worth it.