HACKER Q&A
📣 glouwbug

Why don't non-relational databases just treat rows as files?


Assuming the operating system's file system is built on some B+tree, the filename is the key (a UUID), and I/O calls like fopen() and fclose() will have O(log n) access time. Files can be locked while being modified to prevent race conditions. Yes, there is inode overhead (1 million files in a directory is ~40 MB), but I'm not using a 1994 desktop machine anymore.
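
Here's roughly what I mean, as a minimal sketch (the rows/ directory and write_row helper are just illustrative, not a real system):

    /* Sketch: each "row" is a file named by its UUID key,
       locked with flock() while it is being modified. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/file.h>

    int write_row(const char *uuid, const char *data) {
        char path[512];
        snprintf(path, sizeof path, "rows/%s", uuid);
        int fd = open(path, O_WRONLY | O_CREAT, 0644);
        if (fd < 0) return -1;
        flock(fd, LOCK_EX);                /* lock before touching contents */
        ftruncate(fd, 0);                  /* replace the old row value */
        if (write(fd, data, strlen(data)) < 0) { /* handle error */ }
        flock(fd, LOCK_UN);
        close(fd);
        return 0;
    }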

What am I missing?


  👤 PaulHoule Accepted Answer ✓
I learned never to trust locks in real POSIX environments. There's a tradeoff between protecting integrity and preserving liveness, and the UNIX culture is to prefer liveness.

In Microsoft Windows, locks have a lot more authority, such that two Windows machines can connect to a Microsoft Access database (file-based, a lot like SQLite) over a file share and not conflict with each other.

Windows machines go out to lunch a lot waiting for locks to clear, so that is your tradeoff.
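
A small sketch of why POSIX locks can't be trusted: flock() is advisory, so it only constrains processes that opt in. Run this (file name is illustrative), then from another shell do `echo scribble >> row.dat` while it sleeps, and the write goes through despite the "exclusive" lock:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/file.h>

    int main(void) {
        FILE *f = fopen("row.dat", "a");
        if (!f) return 1;
        flock(fileno(f), LOCK_EX);   /* only blocks other flock() callers */
        puts("holding exclusive lock for 30s...");
        sleep(30);
        fclose(f);                   /* closing the file releases the lock */
        return 0;
    }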


👤 prirun
File systems usually allocate space in 4K blocks. If every row is a file, each row wastes 2K on average, half a block.

Also, read/write efficiency decreases by an average of 50% because of this wasted space: the disk still transfers whole blocks. It's similar to why backup programs (I'm the author of HashBackup) seem to go so slow on small files: if the average file size is 96 bytes, 4,000 of every 4,096 bytes read (more than 97%) are wasted I/O.
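
A back-of-the-envelope version of that arithmetic, assuming a 4,096-byte block (the row sizes here are just examples):

    /* Internal fragmentation: slack in the last 4K block per row size. */
    #include <stdio.h>

    int main(void) {
        const long block = 4096;
        long sizes[] = { 96, 2048, 4000 };
        for (int i = 0; i < 3; i++) {
            long waste = (block - sizes[i] % block) % block;
            printf("%5ld-byte row: %4ld bytes wasted (%.1f%% of the block)\n",
                   sizes[i], waste, 100.0 * waste / block);
        }
        return 0;
    }

A 96-byte row wastes 4,000 bytes (97.7%); a 2,048-byte row wastes half the block, which is where the average 50% figure comes from.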


👤 vivegi
Please read about the I/O, memory, and cache hierarchy. If you access a hot row 10,000 times in a minute, that is 10k fopen() / fcntl() / fclose() calls. Benchmark that in a C program, then benchmark 10k accesses to a pre-allocated, cached memory page in a loop.
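
Something like this rough sketch shows the gap (the row.dat file name and buffer sizes are arbitrary):

    /* 10k reads of one hot "row" via open/read/close versus a
       pre-allocated in-memory copy of the same data. */
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    static double now(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void) {
        char buf[4096] = {0}, hot[4096];
        double t0 = now();
        for (int i = 0; i < 10000; i++) {   /* open/read/close per access */
            FILE *f = fopen("row.dat", "r");
            if (!f) return 1;
            fread(buf, 1, sizeof buf, f);
            fclose(f);
        }
        double t1 = now();
        memcpy(hot, buf, sizeof hot);       /* "cached page" stand-in */
        volatile char sink = 0;
        for (int i = 0; i < 10000; i++)     /* same accesses, from memory */
            sink ^= hot[i % sizeof hot];
        double t2 = now();
        printf("files: %.3fs  memory: %.6fs\n", t1 - t0, t2 - t1);
        return 0;
    }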

That is just reads.

If you include writes + transaction logs + join queries etc., individual files become unwieldy very quickly.