I have worked with graph databases for about a year. I am becoming concerned about their effectiveness.
When I search online, most of the discussion seems to be two or three years old, with no recent development. The solutions offered often look like kludges. Peter Boncz's talk "The (sorry) State of Graph Database Systems" more or less sums it up.
More specifically, my concerns are as follows:
Performance: The elegant Cypher query language makes it easy to construct queries, but the actual execution is often not performant. It appears that only very simple query strategies are utilized in the database engines. Are the graph engines actually capable of analyzing the data, in order to plan for best performance? Something as elementary as recognizing tree structures inside the graph and traversing them in a performant manner?
Ingress capability: For moderately sized queries the query planner is either extremely slow or unable to analyze the query at all. This also happens when inserting what is essentially a sub-graph that touches only one or two existing nodes in the database.
Memory usage and startup time: Are the database engines just "cheating" by reading everything into memory, with a correspondingly long startup time? If so, why should I not just create my own engine, tailored to my specific data?
Scaling: This is related to the above-mentioned issues. Are graph databases currently just toys?
My hands-on experience so far is with RedisGraph (including FalkorDB) and Neo4j.
I assume you really like RedisGraph. Now think about how much development time RedisGraph has had: 5 years? Postgres, for comparison, is 37 years old :D
Horizontal scaling is hard in graph DBs because of how the graph is structured and how you interact with it: a traversal that jumps across different servers is expensive. This is not a trivial problem. It requires a lot of work and investment that graph-based engines have not yet had time to receive.
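To make the cross-server cost a bit more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not how any particular engine works: nodes are integers, `shard_of` is a hypothetical placement rule (node ID modulo the number of shards), and `count_hops` simply counts how many traversal steps stay on one shard versus cross to another.

```python
from collections import deque


def shard_of(node: int, num_shards: int) -> int:
    """Hypothetical placement rule: assign a node to a shard by ID modulo."""
    return node % num_shards


def count_hops(
    adjacency: dict[int, list[int]], start: int, num_shards: int
) -> tuple[int, int]:
    """Breadth-first traversal from `start`.

    Returns (local_hops, remote_hops). A hop is remote when the edge being
    followed connects two nodes that are placed on different shards.
    """
    local_hops = 0
    remote_hops = 0
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            queue.append(neighbor)
            if shard_of(node, num_shards) == shard_of(neighbor, num_shards):
                local_hops += 1
            else:
                remote_hops += 1
    return local_hops, remote_hops


# A small chain: 0 -> 1 -> 2 -> 3 -> 4 -> 5
chain = {i: [i + 1] for i in range(5)}

single = count_hops(chain, start=0, num_shards=1)   # everything on one server
sharded = count_hops(chain, start=0, num_shards=3)  # naive modulo placement
print(single, sharded)
```

With a single shard every hop is local. With the naive modulo placement every hop in this chain lands on a different shard, so each step of the traversal would be a network round trip instead of a pointer chase. Real systems use smarter partitioning, but choosing a good partitioning is itself a hard problem.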
When it comes to Cypher, it is easy to shoot yourself in the foot; that is true. It takes time to learn the tricks and to know how to use it properly.
Performance, ingress, memory usage, etc. are very specific to each system's architecture, so it is hard to comment in general.
This brings me to the actual workload. Graph databases, graph analytical engines, and knowledge graph solutions will have their place under the sun: the more complex the use case, the more value these products will bring.
Just look at what DeepMind is doing with GNNs and graph tech today: https://deepmind.google/discover/blog/millions-of-new-materi...
But, as Peter said, they should not be used for simple use cases where the cost of ownership is not justified because the underlying problem is fairly simple. Postgres is free, after all.
Disclaimer: I work for Memgraph https://memgraph.com
I bet somebody will raise a similar question in a few years' time, when the list under https://db-engines.com/en/ranking/graph+dbms is bigger.
DISCLAIMER: Coming from https://github.com/memgraph/memgraph