This seems like a huge problem IMO; there's so much knowledge out there that we as a civilisation can't afford to have it simply lost to entropy.
This led me to wonder: what would it look like if a community of people decided to try to build an alternative? But instead of it being closed and commercial like Kagi (not trashing Kagi, to be clear), it would be a properly open-source, decentralised system, closer to the Wikipedia model in terms of actual operations and organisation.
I feel like most people agree that the "Golden Age" of Google result quality is well in the past... at least 10 years ago now. So the technology requirements are clearly not novel. On top of that, the explosion of tools like LLMs, and particularly LLM embeddings that enable proper semantic search, should make the task significantly more achievable than it would have been, say, 10 years ago.
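To make that concrete, here's a minimal sketch of embedding-based semantic search, assuming the sentence-transformers library and its all-MiniLM-L6-v2 model (the tiny corpus is obviously a stand-in for crawled pages):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical mini-corpus standing in for indexed pages.
pages = [
    "How to bake sourdough bread at home",
    "A history of the printing press",
    "Debugging segmentation faults in C programs",
]
page_vecs = model.encode(pages, normalize_embeddings=True)

# Note the query shares no keywords with the matching page;
# the embeddings capture the meaning anyway.
query_vec = model.encode("fixing a crash in my C code", normalize_embeddings=True)

# Cosine similarity reduces to a dot product on normalised vectors.
scores = page_vecs @ query_vec
for score, page in sorted(zip(scores, pages), reverse=True):
    print(f"{score:.3f}  {page}")
```

At web scale you'd swap the dot-product loop for an approximate nearest-neighbour index, but the core idea is exactly this simple.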
Obviously scale is a huge challenge, but not an insurmountable one. Back-of-the-napkin maths says roughly 1TB of storage per billion indexed pages (making a huge assumption that each indexed page consumes ~1KB of storage).
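For what it's worth, the arithmetic checks out; here's the same estimate in a couple of lines, with the per-page figure as the obvious knob to turn:

```python
pages = 1_000_000_000           # one billion pages
bytes_per_page = 1_024          # ~1KB per index entry (the big assumption)
total_tib = pages * bytes_per_page / 1024**4
print(f"{total_tib:.2f} TiB")   # ~0.93 TiB, i.e. roughly 1TB per billion pages
```

One wrinkle worth flagging: a single 384-dimensional float32 embedding (like the sketch above produces) is already ~1.5KB on its own, so a ~1KB/page budget would need quantisation, smaller vectors, or a text-only index.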
Google-level performance isn't really needed IMO; I'd gladly wait 3-4 seconds for search results if it meant I actually got what I needed. Obviously even that would be astronomically hard at this scale, but if you accept performance at 1/10th of Google's, it at least moves the whole thing into the realm of possibility.
I'm obviously missing a million different variables... but I can't help but wonder: what would it look like to take back control of "organising the world's information"?