HACKER Q&A
📣 ezekg

What do you use to power search for a static site?


I want to add search to my company's API docs, which is a static HTML website. I originally wanted to use Algolia's DocSearch after seeing it in action at Fathom [0], but I applied for access and according to my rejection letter, it's apparently only available for open source projects (yet I know of more than a few businesses using it... weird.)

Then I looked into using Algolia's main product, but I couldn't figure out how to set it up. Very poor onboarding experience. I got frustrated with the constant paywalls just trying to piece together how the product would work -- "request access to Crawlers", "contact us to learn more." At this point, I'm not even sure if Algolia supports static sites.

What I'd love is to have an automated crawler peel through my site and automatically create an index, and then give me a library that I can use to include a search bar on my site.

I want to pay for a quality product. I do not want to self-host or manage my own index.

Are there any alternatives for static sites? What do you use?

[0]: https://usefathom.com/docs


👤 seanwilson Accepted Answer ✓
How big is the website?

This was a while ago, so maybe there are better libraries for it now, but I've integrated https://lunrjs.com/ before for searching a few hundred FAQs with lengthy answers. You create a static search index file of your articles at build time; it's served to the client (it was ~20KB compressed) and searched in the browser with JavaScript.

The index file will grow with the number of documents you have, but I'm not sure at what stage this approach becomes impractical (does anyone have any benchmarks?). It's worth looking into because you can create a completely custom search UI that updates instantly, plus there's nothing extra to host or pay for, as it's all static.

Edit: This lists similar libraries plus benchmarks: https://github.com/nextapps-de/flexsearch
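
To get a feel for how a build-time index grows with the number of documents, here is a minimal sketch in Python. It is not lunr's format or API; the tokenizer, the `{term: {doc_id: count}}` layout, and the two FAQ entries are all assumptions made for illustration. It builds a basic inverted index, serializes it as compact JSON, and reports the gzipped size you would actually ship to the browser.

```python
import gzip
import json
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term_count} (a basic inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)


def serialize(index):
    """Return the index as compact JSON bytes, plus its gzipped size in bytes."""
    raw = json.dumps(index, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return raw, len(gzip.compress(raw))


# Hypothetical FAQ entries; a real build would read these from the site's sources.
docs = {
    "faq/billing": "How do I update my billing details?",
    "faq/api-keys": "How do I rotate my API keys?",
}

index = build_index(docs)
raw, gzipped_size = serialize(index)
print(f"{len(index)} terms, {len(raw)} bytes raw, {gzipped_size} bytes gzipped")
```

Running this against your real content at a few different sizes is a cheap way to see where the download starts to hurt, before committing to any particular library.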


👤 jil
It doesn’t do live crawling, so it might not be quite what you want, but I built Stork Search (https://stork-search.net) to solve full-text search for static sites.

Today, you’d run a binary as part of a site’s build or deploy process, feeding in the input files. It generates a search index which you deploy alongside your site. The project’s JS library will load that index and turn it into a client-side interactive search interface.
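
For anyone wondering what "feeding in the input files" could look like in a build step, here is a rough Python sketch of the generic idea. It is not how Stork works internally; `TextExtractor`, `extract_text`, and `collect_pages` are hypothetical names. It walks a directory of built HTML files and pulls out the visible text, which is the raw material any indexer would need.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Return the page's visible text as one space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def collect_pages(root):
    """Map each HTML file's path (relative to root) to its extracted text."""
    root = Path(root)
    return {
        str(path.relative_to(root)): extract_text(path.read_text(encoding="utf-8"))
        for path in sorted(root.rglob("*.html"))
    }


# Hypothetical page, inlined so the sketch runs without any files on disk.
page = (
    "<html><head><title>Auth</title><style>p{color:red}</style></head>"
    "<body><h1>Authentication</h1><p>Send your API key in the header.</p>"
    "<script>track()</script></body></html>"
)
text = extract_text(page)
print(text)
```

In a real build you would call `collect_pages` on the output directory and hand the result to whatever indexer you use.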

I’d be curious to see if this sounds interesting or workable for you - you mentioned that you don’t want to host your own index, but does that change if “hosting the index” feels similar to hosting an image, instead of spinning up a server?

I’d be interested in building a paid addition that will crawl your site & host the index - you’re probably the 2nd person I’ve seen with that suggestion. Please let me know if you’d be interested in being a beta user.


👤 csteubs
A colleague of mine created Typesense (https://typesense.org/) a few years ago. I think it may make sense for your use case.

👤 gunnarmorling
I'm using Lucene (a Java library for full-text search), compiled down into a native binary using GraalVM and Quarkus and deployed on AWS Lambda. I discuss the entire setup here: https://www.morling.dev/blog/how-i-built-a-serverless-search....

👤 WorldMaker
Sphinx [1], the documentation tool from the Python world, has almost always supported (for at least a decade at this point) a very simple keyword search. It is based on a simple JSON file that it compiles at generation time, plus a tiny bit of JS that reads the JSON and spits out the results.

It generally works really well and it makes sense that a static site could use a static index. I'm still surprised more documentation tools haven't copied the approach.

[1] https://www.sphinx-doc.org/en/master/
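
The query side of a static keyword index can be very small. Here is a sketch in Python of the general idea, not Sphinx's actual index format or code: the `{term: [page, ...]}` layout and the page names are assumptions, and in a browser the same logic would be a few lines of JS.

```python
import json

# Hypothetical index, shaped as {term: [page, ...]}; a generator would write
# this file once at build time.
INDEX_JSON = (
    '{"auth": ["api/auth.html"],'
    ' "token": ["api/auth.html", "api/webhooks.html"],'
    ' "webhook": ["api/webhooks.html"]}'
)


def search(index, query):
    """Return pages containing every query term (AND semantics), sorted by path."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*matches))


index = json.loads(INDEX_JSON)
print(search(index, "token"))
print(search(index, "token webhook"))
```

There is no ranking, stemming, or typo tolerance here, which is roughly the trade-off with this approach: it is simple and free to host, but only matches exact keywords.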


👤 Something1234
I built a simple search using SQLite as part of the build process for my static site. It uses AWS Lambda to actually serve up the results.
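
As a rough idea of what this kind of setup could look like, here is a Python sketch. It is a guess at the shape, not the poster's code: it assumes SQLite's FTS5 extension is compiled in (it is in most Python builds, but not guaranteed), an API Gateway proxy-style event, and hypothetical table, page, and function names. It uses an in-memory database so it runs standalone; a real deployment would build a file at build time and bundle it with the function.

```python
import json
import sqlite3


def build_db(path, pages):
    """Build step: write (url, title, body) rows into an FTS5 full-text table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(url UNINDEXED, title, body)"
    )
    conn.executemany("INSERT INTO pages (url, title, body) VALUES (?, ?, ?)", pages)
    conn.commit()
    return conn


def search(conn, query, limit=10):
    """Return (url, title) rows matching the query, best match first."""
    if not query.strip():
        return []
    # Quote the input as a single FTS5 phrase so it is not parsed as query syntax.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT url, title FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (phrase, limit),
    )
    return rows.fetchall()


# Hypothetical pages; ":memory:" stands in for a database file shipped with the function.
conn = build_db(":memory:", [
    ("/docs/auth", "Authentication", "Send your API key in the Authorization header."),
    ("/docs/webhooks", "Webhooks", "We sign every webhook payload."),
])


def handler(event, context):
    """Hypothetical Lambda entry point, assuming an API Gateway proxy event."""
    params = event.get("queryStringParameters") or {}
    results = search(conn, params.get("q", ""))
    body = json.dumps([{"url": url, "title": title} for url, title in results])
    return {"statusCode": 200, "body": body}


print(handler({"queryStringParameters": {"q": "api key"}}, None)["body"])
```

Treating the whole query as one phrase keeps the example safe against odd input, at the cost of only matching adjacent words; splitting it into separately quoted terms would give looser matching.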

👤 daibo
I run a search-as-a-service startup and would like to sponsor your doc search. Shoot me an email or DM me on Twitter @linh_at_anvere


👤 daveevad
I've used Solr in the past for this use case.

👤 aqaq2
> I want to pay for a quality product. I do not want to self-host or manage my own index.

Why? It isn't very hard, and for a small site it's very fast too...

I used Lucene in the past and had zero complaints.