HACKER Q&A
📣 axioma

Deciding Cadence vs. Drools for Rules


I am trying to build a rule checker and expect a large number of requests (~10K requests per second of sensor data). I have experience building rules engines with Drools. However, Drools does not scale, and I don't need the flexibility of a full-blown engine. I recently came across Uber Cadence, which promises to be scalable and distributed by design. I have not used a workflow engine before, but it appears this might work for simple rules. I wonder what the community thinks about it.


  👤 shoo Accepted Answer ✓
re: uber cadence (see also the commercialised open source fork https://temporal.io/ by some of the engineers who designed cadence) -- it isn't a rules engine. The problem it is trying to solve is highly reliable, fault-tolerant execution of stateful processes. If that's what is hard about your problem, maybe cadence / temporal could be a good fit.

I am not sure just how far you can scale cadence / temporal in terms of workflow executions per second. My understanding is that the architecture depends on persisting state that tracks workflow execution progress in some backing database (MySQL, Cassandra, Postgres, ...). The problem might reduce to how many operations per second you can get the database to perform.
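
To make that concrete, here is a rough back-of-envelope sketch in Python. It assumes one workflow execution per sensor request, and `WRITES_PER_WORKFLOW` is a placeholder I made up for illustration, not a measured figure for cadence / temporal. You'd want to measure the real number for your own workflows before trusting any estimate like this.

```python
def required_db_writes_per_second(workflows_per_second: float,
                                  writes_per_workflow: float) -> float:
    """Estimate the sustained write load on the backing database."""
    return workflows_per_second * writes_per_workflow


# ~10K requests per second, taken from the question.
REQUESTS_PER_SECOND = 10_000

# Hypothetical: state writes persisted per workflow execution.
# This is an assumption for illustration only; measure it for your workload.
WRITES_PER_WORKFLOW = 5

estimate = required_db_writes_per_second(REQUESTS_PER_SECOND, WRITES_PER_WORKFLOW)
print(f"estimated database writes per second: {estimate:,.0f}")
```

Whatever the real multiplier turns out to be, the point is that the database has to sustain some multiple of your request rate, not just the request rate itself.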

If you are using cadence / temporal, be aware that the SDK doesn't have any understanding or model of things like rules engines, calling external APIs, and so on. You'd need to implement all that yourself. You need to manually figure out how to partition your workflow code into two kinds: side-effect free deterministic code -- which may potentially be executed many times during the same workflow run, if the system needs to replay workflow execution to recover a partial run after losing a worker -- and code that has side effects (actions, invoked by the pure functional workflow code). If you had some existing rules engine in Java / Go that was deterministic & pure functional, you might literally be able to embed calls to that in workflow code, since replaying execution would be safe.
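
Here is a minimal sketch of that split in plain Python. To be clear, this is not the cadence / temporal SDK: `Reading`, `check_rules`, `ActionLog`, and the 80 °C threshold are all hypothetical names and values I invented to illustrate the idea. `ActionLog` is just a toy stand-in for the persisted history a real engine keeps so that a replay doesn't repeat side effects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    temperature_c: float


def check_rules(reading: Reading) -> List[str]:
    """Pure and deterministic: no I/O, no clock, no randomness.

    The same reading always yields the same violations, so it is
    safe to run this again during a replay.
    """
    violations = []
    if reading.temperature_c > 80.0:  # hypothetical threshold
        violations.append("over_temperature")
    return violations


class ActionLog:
    """Toy stand-in for an engine's persisted record of completed actions."""

    def __init__(self) -> None:
        self.results: Dict[str, str] = {}
        self.side_effects = 0  # how many times an action really ran

    def run_once(self, key: str, action: Callable[[], str]) -> str:
        # On replay the recorded result is returned and the action is skipped.
        if key not in self.results:
            self.side_effects += 1
            self.results[key] = action()
        return self.results[key]


def send_alert(sensor_id: str, violation: str) -> str:
    # Placeholder for a real side effect (HTTP call, message publish, ...).
    return f"alert sent: {sensor_id} {violation}"


def workflow(reading: Reading, log: ActionLog) -> List[str]:
    violations = check_rules(reading)  # deterministic part
    for v in violations:               # side effects go through the log
        log.run_once(f"{reading.sensor_id}:{v}",
                     lambda v=v: send_alert(reading.sensor_id, v))
    return violations


log = ActionLog()
reading = Reading("sensor-1", 95.0)
first = workflow(reading, log)
replayed = workflow(reading, log)  # simulate a replay after losing a worker
print(first, replayed, log.side_effects)
```

Running the workflow twice against the same log re-evaluates the rules both times but only fires the alert once, which is roughly the property you'd have to preserve by hand when embedding a rules engine in workflow code.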

If there's no serious business case for recovering your failed workflow executions -- unlike e.g. user onboarding or payment processing -- you probably have a much easier problem, or maybe one that is just hard in a different way, where introducing cadence / temporal doesn't help solve what's hard.