HACKER Q&A
📣 sio_narancsle

Serverless Full-Text Search


I'd love to have a minimal but functional full-text or fuzzy text search solution for data stored in DynamoDB. I've looked at countless solutions and have implemented a bunch on my own (dynamo stream -> elastic/tantivy/orama), but I still haven't gotten a satisfying result, so I've set myself the goal of writing a library.

I'm still researching the topic and would appreciate pointers, good reads, and similar advice. The requirements would be:

- Use a single file as the index, so it can easily be stored in S3, for example. I want the library to return only IDs on search, so the index would not store the actual data, which keeps it small. The index would be loaded into the memory of, e.g., an AWS Lambda at startup (or kept on disk with EFS, whichever is better).

- Support at least 1 million records with a response time of around 200-300 ms.
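
To make the first requirement concrete, here is a rough sketch of an IDs-only inverted index that lives in a single file and is loaded into memory at startup. Everything in it is an assumption for illustration: the function names (`build_index`, `save_index`, `load_index`, `search`), the naive lowercase tokenizer, JSON as the file format, and AND semantics for multi-word queries. It does no fuzzy matching and has not been measured against the 1 million record target.

```python
import json
import re
from collections import defaultdict

# Hypothetical tokenizer: lowercase ASCII letters and digits only.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


def build_index(records):
    """records: iterable of (record_id, text). Returns {token: sorted IDs}."""
    postings = defaultdict(set)
    for record_id, text in records:
        for token in tokenize(text):
            postings[token].add(record_id)
    # Only IDs are kept; the original text is never stored in the index.
    return {token: sorted(ids) for token, ids in postings.items()}


def save_index(index, path):
    """Write the whole index as one file (which could then be uploaded to S3)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(path):
    """Load the whole index into memory, e.g. once per Lambda cold start."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def search(index, query):
    """Return the sorted IDs of records containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = None
    for token in tokens:
        ids = set(index.get(token, ()))
        result = ids if result is None else result & ids
        if not result:
            return []
    return sorted(result)
```

The returned IDs would then be used to fetch the actual items from DynamoDB. JSON is only the simplest choice here; a compact binary format would likely be needed to keep the file small at scale.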


  👤 sio_narancsle Accepted Answer ✓
What should the data structure look like if I want to store it on disk instead of in memory? How can I read only a portion of, say, a 10 GB index file stored on disk when a query comes in?
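
One possible layout, sketched under assumptions rather than offered as a definitive answer: put a small term directory at the front of the file that maps each term to the byte offset and length of its posting list, keep only that directory in memory, and seek to the needed range per query. The file format below (the `FTS1` magic, a JSON directory, unsigned 32-bit little-endian IDs) and the names `write_index` and `DiskIndex` are all hypothetical. It also assumes integer IDs, so string keys from DynamoDB would need a separate mapping.

```python
import json
import struct

# Hypothetical file layout:
#   [4-byte magic][8-byte directory length][directory JSON][posting lists]
HEADER = struct.Struct("<4sQ")
MAGIC = b"FTS1"


def write_index(path, postings):
    """postings: {term: sorted list of int IDs, each below 2**32}."""
    directory = {}
    blob = bytearray()
    for term in sorted(postings):
        ids = postings[term]
        # Record where this term's IDs start (relative to the postings area).
        directory[term] = [len(blob), len(ids)]
        blob += struct.pack(f"<{len(ids)}I", *ids)
    dict_bytes = json.dumps(directory).encode("utf-8")
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, len(dict_bytes)))
        f.write(dict_bytes)
        f.write(blob)


class DiskIndex:
    """Keeps only the term directory in memory; posting lists stay on disk."""

    def __init__(self, path):
        self._f = open(path, "rb")
        magic, dict_len = HEADER.unpack(self._f.read(HEADER.size))
        if magic != MAGIC:
            raise ValueError("not an index file")
        self._directory = json.loads(self._f.read(dict_len))
        self._base = HEADER.size + dict_len  # start of the postings area

    def postings(self, term):
        """Read just the bytes for one term's posting list."""
        entry = self._directory.get(term)
        if entry is None:
            return []
        offset, count = entry
        self._f.seek(self._base + offset)
        data = self._f.read(count * 4)
        return list(struct.unpack(f"<{count}I", data))

    def close(self):
        self._f.close()
```

With this shape, a query touches only the header, the directory (read once), and the posting lists of its own terms. The same offset and length pairs could in principle drive ranged reads against S3 instead of local seeks. For a very large vocabulary the directory itself may get too big to hold in memory, and a sorted on-disk term table with binary search would be one way around that.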