HACKER Q&A
📣 hahnchen

Decent, open source search engine?


I have a lot of bookmarks of blogs and I would like a search engine where I can search exclusively within those sites.

Ideally, I want the search engine to crawl the links I give it so I can search the subpages of the sites too.

I know Google has a “site:” parameter, but I don't think I can put 50 sites into one Google search.
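
For reference, the closest workaround I can think of is splitting the site list into batches and running one "site:" query per batch. A rough sketch is below. The function name `build_site_queries`, the batch size of 10, and the example.com hostnames are all made up for illustration. I'm also assuming Google accepts several "site:" filters joined with OR, and I haven't checked how long a query is allowed to be.

```python
from urllib.parse import quote_plus


def build_site_queries(terms, sites, batch_size=10):
    """Return one Google search URL per batch of sites.

    batch_size is a guess, not a documented limit.
    """
    urls = []
    for start in range(0, len(sites), batch_size):
        batch = sites[start:start + batch_size]
        # Join the batch as: site:a.com OR site:b.com OR ...
        site_filter = " OR ".join(f"site:{s}" for s in batch)
        query = f"{terms} ({site_filter})"
        urls.append("https://www.google.com/search?q=" + quote_plus(query))
    return urls


if __name__ == "__main__":
    # Hypothetical bookmark list standing in for the 50 blogs.
    bookmarks = [f"blog{i}.example.com" for i in range(50)]
    for url in build_site_queries("static site generator", bookmarks):
        print(url)
```

That still means five separate searches for 50 sites and no merged results, which is why I'd rather have something that crawls the list itself.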


  👤 kordlessagain Accepted Answer ✓
I started https://mitta.us to do exactly this, but I am pivoting to prompt management for GPT-3. I've open-sourced the code for the crawler here: https://github.com/kordless/grub-2.0. The entire system uses Google Vision for extracting text. I dislike fiddling with the DOM...

If you are interested in using Solr for this, I can provide instructions to you. I'm kordless at the gmails ... com. I'm going to release the entire code for doing this with the UI once my pivot is done.

If you drop me a line, I can let you preview the UI.


👤 magnio
https://github.com/a5huynh/spyglass

> A personal search engine, crawl & index websites you want with a simple set of rules


👤 pull_my_finger
It's still in development, but you could check out searchhut[1] by the people from sourcehut.

[1]: https://git.sr.ht/~sircmpwn/searchhut


👤 gardenfelder