Where should this data be stored? Is it considered acceptable for the web server to just INSERT every event directly into a SQL database table? If so, then at what volume of throughput does that break, and how should one handle higher scale?
Let's say that this is for a website where users can generate content (e.g. YouTube) and view detailed analytics on that content.
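For concreteness, here is a minimal sketch of one common middle ground between "INSERT every event" and a dedicated pipeline: buffer events in the web process and write them in batches. Everything in it is an assumption for illustration, not something from the thread: the `events` table, its columns, the `EventBuffer` class, and the batch size are hypothetical, and `sqlite3` merely stands in for whatever SQL database is actually used.

```python
import sqlite3
import time


class EventBuffer:
    """Hypothetical helper: collects events in memory and writes them in batches."""

    def __init__(self, conn, batch_size=500):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []

    def record(self, user_id, content_id, event_type):
        # Queue the event; only touch the database once a full batch is ready.
        self.pending.append((time.time(), user_id, content_id, event_type))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        with self.conn:  # one transaction (one commit) per batch
            self.conn.executemany(
                "INSERT INTO events (ts, user_id, content_id, event_type) "
                "VALUES (?, ?, ?, ?)",
                self.pending,
            )
        self.pending.clear()


# In-memory database as a stand-in for the real SQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events "
    "(ts REAL, user_id INTEGER, content_id INTEGER, event_type TEXT)"
)

buf = EventBuffer(conn, batch_size=3)
for i in range(7):
    buf.record(user_id=i, content_id=42, event_type="view")
buf.flush()  # write the remainder, e.g. on shutdown or on a timer

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)
```

The trade-off is that anything still sitting in the buffer is lost if the process dies, so this only suits events where occasional loss is tolerable.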
It's common to ingest logs into something like Elasticsearch, for performance and reliability reasons.
This is a common enough problem that MongoDB Atlas has a feature that exposes searchable data through a Lucene-based backend.[0] I've never used it, but I found the concept interesting because it fits the convenient working pattern of "shove it all in the DB and figure it out later."
PostgreSQL for transactional logs
ClickHouse for analytics data
Elasticsearch or Quickwit for terabytes of data, disk persisted, if I need thorough search on structured JSON
---
Others I use for different use cases:
Typesense for searching MBs to GBs of data, memory persisted
Redis for caching KBs of data, memory persisted
You can use something like ClickHouse [0], for example, or third-party SaaS solutions like PostHog [1] that are built on top of ClickHouse.
[0] https://clickhouse.com/blog/analyzing-aws-fow-logs-using-cli...