HACKER Q&A
📣 aster0id

How do you handle high cardinality configuration that changes often?


I define high cardinality configuration as configuration values that depend on multiple input variables with (relatively) high cardinality, while the output is usually single valued / low cardinality.

For example, consider a company that offers a service and charges a price for it. The target configuration value here is the price that needs to be shown to the customer. The company operates in all 50 states, but charges a different price in some states due to regulations. They also have other variables that impact pricing, such as membership tier, age of the customer, etc. They might also want to experiment with a different price for a subset of customers in a particular state. Here the input variables are high cardinality (50 states × 5 membership tiers × 3 age groups = 750 combinations), but the target value is just the price, which might take only 5 different values.

I would imagine that the above scenario would be handled through a mix of configuration values for the general case + special casing if-else logic in code for corner cases that cannot be easily represented in configuration.

The main concern I have is splitting of the actual business logic between the configuration + special casing code. I would personally prefer if I just had a magic dictionary that would give me the price I need to show a customer given their state, membership tier and age, and the map would contain all the business logic.

I imagine one argument would be that "the code + configuration is the magic map you're looking for". But I've seen this pattern blow up in many different ways in my admittedly limited 5 years of experience as a backend software engineer. For example, some external service handles the logic to override the price based on a particular variable, and changes made in that service then break things in unpredictable ways. My example is not perfect because it reeks of bad abstraction, but I just wanted to make the point that multivariate rules are hard to write and even harder to maintain.

One more way I've seen things become unmanageable is when the number of special-case if-else statements blows up to the point where no one understands what will actually happen if you add a new variable to the mix, or add a new condition to an existing variable. This again is not a perfect example, because you should ideally have tests that give you the confidence to make changes, but the point remains.

I ask this question because I was considering building a configuration store that can hold multivariate, hierarchical, range-based configuration, so that all the logic lives inside the configuration and can be changed easily and frequently. I'm not sure if something like this already exists. Please let me know if it does! I would love to use it instead of maintaining classes full of if-else statements.
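To make the idea concrete, here is a minimal sketch in Python of what such a store could look like. Everything in it is an assumption for illustration: the `Rule` class, the `lookup_price` function, the state and tier names, and the prices are all hypothetical. Each rule may leave a field unset to mean "any value", and the most specific matching rule wins. Ties between equally specific rules with different prices are reported as errors rather than resolved silently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One pricing rule. A field left as None matches any value (hypothetical schema)."""
    price: float
    state: Optional[str] = None
    tier: Optional[str] = None
    age_group: Optional[str] = None

    def matches(self, state: str, tier: str, age_group: str) -> bool:
        return (self.state in (None, state)
                and self.tier in (None, tier)
                and self.age_group in (None, age_group))

    def specificity(self) -> int:
        # Number of fields this rule pins down; more pinned fields = more specific.
        return sum(v is not None for v in (self.state, self.tier, self.age_group))


# Illustrative data only: a default price plus a few overrides.
RULES = [
    Rule(price=10.0),                          # general case
    Rule(price=12.0, state="CA"),              # state-level override
    Rule(price=8.0, tier="gold"),              # tier-level override
    Rule(price=9.0, state="CA", tier="gold"),  # resolves the CA + gold overlap
]


def lookup_price(state, tier, age_group, rules=RULES):
    """Return the price from the most specific matching rule."""
    candidates = [r for r in rules if r.matches(state, tier, age_group)]
    if not candidates:
        raise LookupError("no rule matches")
    best = max(r.specificity() for r in candidates)
    winners = [r for r in candidates if r.specificity() == best]
    if len({r.price for r in winners}) > 1:
        # Two equally specific rules disagree: surface it instead of guessing.
        raise ValueError(f"ambiguous rules: {winners}")
    return winners[0].price
```

With this shape, the special cases become data rows instead of if-else branches, and adding a new variable means adding a field to the rule. The ambiguity check is the part that would need the most thought in a real system, since precedence between overlapping rules is itself a business decision.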


  👤 stop50 Accepted Answer ✓
I guess you want something like this:

  [prices]
  new_membership_rate = "(0.1 + 0.9 ^ TIER * AGEGROUP) * STATE_TAX"

  [[prices.agegroup]]
  min = 15
  max = 18
  name = "gen z"
  modificator = 0.75

  [[prices.agegroup]]
  min = 60
  max = 120
  name = "boomer"
  modificator = 1.5
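
A rough sketch of how range entries like these could be resolved at lookup time, in Python. The age groups are mirrored as plain dicts rather than parsed from a file. The function name `age_modificator`, the inclusive `min`/`max` bounds, and the fallback of `1.0` for ages outside every bracket are assumptions, not something the config above specifies.

```python
# Age brackets copied from the config above, as plain dicts.
AGE_GROUPS = [
    {"min": 15, "max": 18, "name": "gen z", "modificator": 0.75},
    {"min": 60, "max": 120, "name": "boomer", "modificator": 1.5},
]


def age_modificator(age, groups=AGE_GROUPS, default=1.0):
    """Return the modificator of the first bracket containing `age`.

    Bounds are treated as inclusive (assumption). Ages that fall in no
    bracket get `default` (assumption: no adjustment).
    """
    for group in groups:
        if group["min"] <= age <= group["max"]:
            return group["modificator"]
    return default
```

The formula string itself would still need an evaluator with agreed semantics for `^` and for how `TIER` and `AGEGROUP` map to numbers, which this sketch deliberately leaves out.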