HACKER Q&A
📣 peterth3

Is there a search tool that indexes a curated subset of the web?


I sometimes find myself appending filters to my search queries, like “site:reddit.com”, “site:github.com”, and maybe even something like “site:x.com OR site:y.com OR site:z.com”. I know I’m not alone, because there was a discussion on here yesterday [0] where other devs shared that they’re doing the same thing.
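
To make the manual filtering concrete, here is a rough sketch of how it could be scripted. The function name `build_site_query` and the example domains are hypothetical, and engines differ in how they parse `OR` combined with `site:`, so treat this as an illustration rather than a recipe for any particular engine.

```python
def build_site_query(terms: str, domains: list[str]) -> str:
    """Append one site: filter per domain, joined with OR.

    Hypothetical helper: it only builds the query string and does not
    contact any search engine.
    """
    if not domains:
        # Nothing to restrict to, so return the plain query.
        return terms
    site_filter = " OR ".join(f"site:{domain}" for domain in domains)
    return f"{terms} {site_filter}"


if __name__ == "__main__":
    # Example domains taken from the filters mentioned above.
    print(build_site_query("asyncio timeout", ["reddit.com", "github.com"]))
```

The resulting string can be pasted into whichever search engine you already use; keeping the domain list in one place is the only real benefit over typing the filters by hand.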

From my limited view of the search engine market, I see successful search companies existing at one of two extremes. Either:

1. Index the entire web. This category would include companies like Google, DuckDuckGo, Bing, etc.

Or

2. Index one website at a time. This would include companies like Algolia and Elasticsearch.

But there don’t seem to be any companies that focus on search use cases in between 1 and 2. As a software engineer, it would be nice to have a search tool that only indexes sites like Stack Overflow, relevant GitHub repos, our dependencies’ docs, internal docs, and maybe a couple of technical subreddits. I might even use this hypothetical search tool more than Google in my daily development work.

I’ve looked around for a search tool like I’m describing, but I can’t find anything that scratches my itch.

I did find YaCy [1], but it seems more technical and cumbersome than what I want. It’s focused more on other things like p2p/decentralization. And it looks like it’s been around for a while without much traction. But maybe my assessment is shortsighted.

Does anyone here know a search tool like I’m describing?

Or maybe this is a tarpit idea… If so, then why?

[0] https://news.ycombinator.com/item?id=33799767
[1] https://yacy.net/


  👤 version_five Accepted Answer ✓
Someone did a Show HN that attempted this a while ago:

https://news.ycombinator.com/item?id=29774456

It uses Google Custom Search, IIRC.