HACKER Q&A
📣 ratpik

How to build full text search at scale?


Data Store - Elasticsearch

Scale - 10 million writes/day (500 GB/day), about 100K search queries per day
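
For sizing discussions it can help to restate those daily figures as per-second averages. The sketch below only does that arithmetic; it assumes decimal gigabytes and a perfectly even load across the day, so real peaks will be higher than these averages.

```python
# Convert the daily figures from the question into per-second averages.
# Assumptions (not stated in the question): 1 GB = 10**9 bytes, and load
# is spread evenly over 24 hours.
SECONDS_PER_DAY = 24 * 60 * 60

writes_per_day = 10_000_000
bytes_per_day = 500 * 10**9
queries_per_day = 100_000

writes_per_sec = writes_per_day / SECONDS_PER_DAY
queries_per_sec = queries_per_day / SECONDS_PER_DAY
avg_doc_bytes = bytes_per_day / writes_per_day
ingest_mb_per_sec = bytes_per_day / SECONDS_PER_DAY / 10**6

print(f"writes/sec (avg):   {writes_per_sec:.1f}")
print(f"queries/sec (avg):  {queries_per_sec:.2f}")
print(f"avg document size:  {avg_doc_bytes / 1000:.0f} KB")
print(f"ingest rate (avg):  {ingest_mb_per_sec:.2f} MB/s")
```

Under those assumptions this works out to roughly 116 writes/sec, a little over 1 query/sec, about 50 KB per document, and just under 6 MB/s of ingest, so the workload is write-heavy.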

Trying to figure out:

1 - How to control access to data (multi-tenancy where there are ~100K tenants)

2 - Database design - indexes and shards, and best practices around mixing different types of documents in a single index.
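
To make point 1 concrete, one pattern I am considering is a shared index where every document carries a tenant field and every search is wrapped in a filter on it. This is only a sketch of the request body in the Elasticsearch Query DSL: the field names `tenant_id` and `body` and the helper names are hypothetical, it assumes `tenant_id` is mapped as a keyword, and it assumes the filter is added by trusted server-side code rather than by the client.

```python
def tenant_search_body(tenant_id: str, text: str, field: str = "body") -> dict:
    """Build a search body that only matches documents owned by tenant_id.

    `tenant_id` and `body` are hypothetical field names for illustration.
    """
    return {
        "query": {
            "bool": {
                # Full-text relevance scoring happens in the "must" clause.
                "must": [{"match": {field: text}}],
                # The tenant restriction is a non-scoring filter clause.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }


def tenant_search_params(tenant_id: str) -> dict:
    """Request parameters that route the search by tenant.

    This only helps if documents were indexed with the same routing value.
    """
    return {"routing": tenant_id}


body = tenant_search_body("tenant-42", "quarterly report")
params = tenant_search_params("tenant-42")
```

Whether a single filtered index holds up with ~100K tenants, or whether tenants need to be grouped across several indexes, is exactly what I am unsure about.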


  👤 itronitron Accepted Answer ✓
I recommend writing down what exactly scale means for your needs. Number of users? Number of queries? Number of sources? Number of 'result-sets'? Number of documents? Number of text fields?

Elasticsearch is built on top of Lucene, a Java library that you can embed in pretty much any application. If you already have a system that can search the MySQL clusters, then I would recommend hooking Lucene into that system instead of standing up another one.


👤 bufferoverflow
I doubt you will get a useful answer in a comment. Each of your questions is very broad, and you haven't even selected a search engine yet. You didn't specify the scale you're dealing with. You didn't specify the number of reads/writes per second that you expect.

Choose one system and learn it well.