Why do you think that is? Are there possibly other projects out there that I'm not familiar with?
- https://github.com/DataManagementLab/ScaleStore - "A Fast and Cost-Efficient Storage Engine using DRAM, NVMe, and RDMA"
- https://github.com/unum-cloud/udisk (https://github.com/unum-cloud/ustore) - "The fastest ACID-transactional persisted Key-Value store designed for NVMe block-devices with GPU-acceleration and SPDK to bypass the Linux kernel."
- https://github.com/capsuleman/ssd-nvme-database - "Columnar database on SSD NVMe"
Given, however, that most of the world has shifted to VMs, I don't think KV storage is accessible for that reason alone: the disks are often split across multiple tenants. So the overall demand for this would be low.
I mean, using a Merkle tree or something like that to make sense of the underlying data.
Otherwise, though… you have the file system. Is that not enough?
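For what it's worth, a toy sketch of what I mean: hash fixed-size blocks read straight off the device and fold them into a Merkle root, so you can verify or diff the raw contents without a file system on top. FNV-1a stands in for a real cryptographic hash here, and the block contents are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Placeholder hash (FNV-1a); a real design would use SHA-256 or BLAKE3. */
static uint64_t fnv1a(const void *data, size_t len) {
    const uint8_t *p = data;
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Fold leaf hashes pairwise until a single root remains. */
static uint64_t merkle_root(uint64_t *leaves, size_t n) {
    while (n > 1) {
        size_t out = 0;
        for (size_t i = 0; i < n; i += 2) {
            uint64_t pair[2] = { leaves[i], (i + 1 < n) ? leaves[i + 1] : leaves[i] };
            leaves[out++] = fnv1a(pair, sizeof pair);
        }
        n = out;
    }
    return leaves[0];
}

int main(void) {
    /* Pretend these are 4 KiB blocks read straight off the namespace. */
    const char *blocks[] = { "block0", "block1", "block2", "block3" };
    uint64_t leaves[4];
    for (size_t i = 0; i < 4; i++)
        leaves[i] = fnv1a(blocks[i], strlen(blocks[i]));
    printf("root = %016llx\n", (unsigned long long)merkle_root(leaves, 4));
    return 0;
}
```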
[1] These slides claim up to 32 bytes, which would be a practically useful length: https://www.snia.org/sites/default/files/ESF/Key-Value-Stora... but the current revision of the standard only permits two 64-bit words as the key ("The maximum KV key size is 16 bytes"): https://nvmexpress.org/wp-content/uploads/NVM-Express-Key-Va...
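To make the 16-byte limit concrete: a key on a KV namespace is at most two 64-bit words, so anything longer has to be hashed or truncated at the application level. A minimal sketch of packing a short string into that shape (the struct and helper are mine, not from the spec):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The spec's "two 64-bit words" key, i.e. at most 16 bytes. */
struct kv_key {
    uint64_t w[2];
};

/* Pack up to 16 bytes of string into the key; returns -1 if it doesn't fit. */
static int kv_key_from_str(struct kv_key *k, const char *s) {
    size_t len = strlen(s);
    if (len > sizeof k->w)
        return -1;  /* longer keys need an app-level hash or prefix scheme */
    memset(k->w, 0, sizeof k->w);
    memcpy(k->w, s, len);
    return 0;
}

int main(void) {
    struct kv_key k;
    if (kv_key_from_str(&k, "user:42:profile") == 0)  /* 15 bytes: fits */
        printf("key = %016llx %016llx\n",
               (unsigned long long)k.w[0], (unsigned long long)k.w[1]);
    return 0;
}
```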
Utilizing: https://memcached.org/blog/nvm-caching/, https://github.com/m...
TL;DR: Grafana Cloud needed tons of caching, and it was expensive. So they used extstore in memcached to hold most of it on NVMe disks, which massively reduced their costs.
The Azure Lv3/Lsv3/Lav3/Lasv3 series all provide this capability, for example.
Ref: https://learn.microsoft.com/en-us/azure/virtual-machines/las...
Even more complex when you want any kind of redundancy, as you'd essentially need to build some kind of RAID-like layer into your database.
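A crude illustration of what that RAID-like layer means inside the engine: every page write has to land on (at least) two devices before it can be acknowledged. This is a minimal mirroring sketch only; the file paths and page size are placeholders, and a real engine also needs rebuild and consistency machinery.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 4096

/* Mirror one page to two devices; both writes and fsyncs must succeed
 * before the page is considered durable (RAID-1-style semantics). */
static int mirrored_write(int fd_a, int fd_b, const void *page, off_t off) {
    if (pwrite(fd_a, page, PAGE_SIZE, off) != PAGE_SIZE) return -1;
    if (pwrite(fd_b, page, PAGE_SIZE, off) != PAGE_SIZE) return -1;
    if (fsync(fd_a) != 0 || fsync(fd_b) != 0) return -1;
    return 0;
}

int main(void) {
    /* Placeholder targets; real code would open the raw namespaces instead. */
    int fd_a = open("mirror_a.img", O_RDWR | O_CREAT, 0644);
    int fd_b = open("mirror_b.img", O_RDWR | O_CREAT, 0644);
    if (fd_a < 0 || fd_b < 0) { perror("open"); return 1; }

    char page[PAGE_SIZE];
    memset(page, 0xAB, sizeof page);
    if (mirrored_write(fd_a, fd_b, page, 0) != 0) { perror("mirrored_write"); return 1; }

    close(fd_a);
    close(fd_b);
    return 0;
}
```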
Also, a few terabytes of NVMe in RAID 10 plus PostgreSQL or similar covers about 99% of companies' needs for speed.
So you're left with the 1% who need that kind of speed.
You might also be interested in xNVMe and the RocksDB/Ceph KV drivers:
https://github.com/OpenMPDK/xNVMe
I like how you reference the performance benefits of NVMe direct addressing, but then immediately lament that you can't access these benefits across a SEVEN LAYER STACK OF ABSTRACTIONS.
You can either lament the dearth of userland direct-addressable performant software, OR lament the dearth of convenient network APIs that thrash your cache lines and dramatically increase your access latency.
You don't get to do both simultaneously.
Embedded is a feature for performance-aware software, not a bug.
https://github.com/aerospike/aerospike-server/blob/master/cf...
There are other occurrences in the codebase, but that is the most prominent one.
> NVMe SSDs based on flash are cheap and offer high throughput. Combining several of these devices into a single server enables 10 million I/O operations per second or more. Our experiments show that existing out-of-memory database systems and storage engines achieve only a fraction of this performance. In this work, we demonstrate that it is possible to close the performance gap between hardware and software through an I/O optimized storage engine design. In a heavy out-of-memory setting, where the dataset is 10 times larger than main memory, our system can achieve more than 1 million TPC-C transactions per second.
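The I/O path such engines build on looks roughly like this: O_DIRECT reads submitted in batches through io_uring instead of one blocking syscall per page. A minimal sketch, assuming liburing is installed (link with -luring); the device path, queue depth, and block size are arbitrary placeholders, not anything from the paper.

```c
#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define QUEUE_DEPTH 64
#define BLOCK_SIZE  4096

int main(int argc, char **argv) {
    const char *path = argc > 1 ? argv[1] : "/dev/nvme0n1";  /* placeholder target */
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    struct io_uring ring;
    int ret = io_uring_queue_init(QUEUE_DEPTH, &ring, 0);
    if (ret < 0) { fprintf(stderr, "io_uring_queue_init: %s\n", strerror(-ret)); return 1; }

    /* O_DIRECT needs aligned buffers; one per in-flight read. */
    void *bufs[QUEUE_DEPTH];
    for (int i = 0; i < QUEUE_DEPTH; i++) {
        if (posix_memalign(&bufs[i], BLOCK_SIZE, BLOCK_SIZE)) { perror("posix_memalign"); return 1; }
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, bufs[i], BLOCK_SIZE, (off_t)i * BLOCK_SIZE);
        io_uring_sqe_set_data(sqe, bufs[i]);
    }

    /* One syscall submits the whole batch. */
    io_uring_submit(&ring);

    /* Reap completions; a storage engine would hand pages to its buffer manager here. */
    for (int i = 0; i < QUEUE_DEPTH; i++) {
        struct io_uring_cqe *cqe;
        ret = io_uring_wait_cqe(&ring, &cqe);
        if (ret < 0) { fprintf(stderr, "io_uring_wait_cqe: %s\n", strerror(-ret)); return 1; }
        if (cqe->res < 0)
            fprintf(stderr, "read failed: %s\n", strerror(-cqe->res));
        io_uring_cqe_seen(&ring, cqe);
    }

    io_uring_queue_exit(&ring);
    close(fd);
    return 0;
}
```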