Requirements / desires:
* Reliable
* Performant
* Easy to set up and operate
Some options:
SeaweedFS - https://github.com/seaweedfs/seaweedfs
289 hits: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=seaweedfs&sort=byPopularity&type=all
JuiceFS - https://github.com/juicedata/juicefs
2047 hits: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=juicefs&sort=byPopularity&type=all
MooseFS - https://github.com/moosefs/moosefs
126 hits: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=moosefs&sort=byPopularity&type=all
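For a sense of the "easy to set up" bar I'm after: SeaweedFS's single-node quickstart is roughly the following (a sketch from memory, so double-check the README for current flags):

    # master + volume server + filer + S3 gateway in one process
    weed server -dir=/data -filer -s3

    # mount the filer as a regular filesystem on a client
    weed mount -filer=localhost:8888 -dir=/mnt/weed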
Do people still use Ceph or Gluster? I don't think they qualify as "easy to set up and operate".
Thanks!
My home Ceph cluster currently runs on seven Raspberry Pis, so it's not very performant, but the uptime over the past 2-3 years has been unbeatable.
With cephadm it's very easy to stand up, and upkeep has been basically zero.
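For the curious, the bring-up is roughly this (the IP and hostnames are placeholders for your own):

    # on the first node: bootstrap a one-monitor cluster
    cephadm bootstrap --mon-ip 192.168.1.10

    # enroll the other nodes
    ceph orch host add pi2 192.168.1.11

    # turn every unused disk into an OSD
    ceph orch apply osd --all-available-devices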
I would absolutely not call Gluster easy to admin... but I've enjoyed it. It's very workable.
I've brought volumes back from the ashes with only the metadata stored on the bricks themselves! Yet '/var/lib/glusterd' somehow remains a liability. The details of that recovery escape me.
With RDMA it performs like locally attached storage. Performance is great, but once the thing is off the rails... Eesh.
Setup is deceptively easy (see the sketch below). There are a few reasonable configurations and far more that are less than ideal, and the topology you pick up front locks in a lot of later decisions.
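To show what I mean by deceptively easy, a minimal 3-way replica volume is about this much (hostnames and brick paths are placeholders):

    # from node1: form the trusted pool
    gluster peer probe node2
    gluster peer probe node3

    # create and start a 3-way replicated volume
    gluster volume create gv0 replica 3 \
        node1:/bricks/b0 node2:/bricks/b0 node3:/bricks/b0
    gluster volume start gv0

    # mount from any client
    mount -t glusterfs node1:/gv0 /mnt/gv0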
BeeGFS also locks away HA behind $$$.
Sure, you could go with the Ceph easy button, but you could also play around with Lustre+ZFS or pNFS (rough sketch after the links). [0,1]
0. https://wiki.lustre.org/Managing_Lustre_with_the_ZFS_backend...
1. [2017, pdf] https://events.static.linuxfound.org/sites/events/files/slid...
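Roughly what the ZFS-backed bring-up in [0] looks like (pool names, devices, and NIDs are placeholders, and flags vary by version, so treat this as a sketch):

    # combined MGS+MDT on a ZFS pool
    mkfs.lustre --fsname=demo --mgs --mdt --index=0 \
        --backfstype=zfs mdtpool/mdt0 /dev/sdb
    mount -t lustre mdtpool/mdt0 /mnt/mdt

    # an OST on a storage node
    mkfs.lustre --fsname=demo --ost --index=0 --mgsnode=mds@tcp \
        --backfstype=zfs ostpool/ost0 /dev/sdc
    mount -t lustre ostpool/ost0 /mnt/ost0

    # clients mount the whole namespace
    mount -t lustre mds@tcp:/demo /mnt/demo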
Ceph is probably solid but not simple.
Thus no distributed filesystem for me after all; one fast box running ZFS instead. I couldn't convince myself that a Ceph cluster would be more reliable in practice than a single ZFS box with backups.
It has been working reliably, though, so for non-critical, time-constrained loads it's good enough.
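For what it's worth, the backup half is just snapshots shipped to another box (pool, dataset, and host names are placeholders):

    # full send once
    zfs snapshot tank/data@2024-06-01
    zfs send tank/data@2024-06-01 | ssh backup zfs recv -u pool2/data

    # then incrementals of only the delta
    zfs snapshot tank/data@2024-06-08
    zfs send -i tank/data@2024-06-01 tank/data@2024-06-08 \
        | ssh backup zfs recv -u pool2/data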
I'd use object storage if I wanted "simple" in the replication, JBOD disk management, and access control senses of simple. If I had to use file storage, I'd use boring old SFTP or SMB on top of ZFS on a NAS (sketch below).
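Concretely, on a Linux NAS that can be as little as this (dataset, user, and host names are placeholders):

    # one dataset per share, snapshots for rollback
    zfs create tank/share
    zfs set sharesmb=on tank/share   # on Linux this leans on Samba's usershare support

    # sftp needs nothing beyond a running sshd
    sftp alice@nas:/tank/share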