Why do search engines not let you blacklist spam domains?
Whenever I'm searching for anything even mildly off the beaten path, it's not uncommon for the top results to be SEO-stuffed spam websites, or even real websites that I can't access (paywalls, or sites that demand adblocker exceptions to proceed). Usually pages from the same domains are top-ranked for other related searches too.
As a user I'd love to be able to tell my search engine "Never show me results from this domain" (similar to blocking an account on Twitter) – but as far as I can tell there is no way to do this in either Google or DuckDuckGo.
This seems like such low-hanging fruit to me that I'm wondering if other people have ever wanted this, and if there's actually a reason not to do it.
In my opinion, back then they needed the data as a training set for spammy-domain detection. Now that SERP spam is no longer a serious issue (in Google's eyes, anyway), why bother? Google always knows what's best for you.
Because Google and Bing (and therefore DuckDuckGo) are increasingly answer engines: they keep you on their page, supporting the SEO ecosystem and, most importantly, their ad revenues and network customers.
We and the other few independent search engines have not made enough of a dent in the market to suffer SEO spam. We'll have a way to deal with it (watch this space). Right now you'll certainly get results "off the beaten path", and with one click you can try out 8 other search options [0].
[0] https://blog.mojeek.com/2022/02/search-choices-enable-freedo...
For the same reason that streaming services don't really give you the ability to filter by cast/crew or hide stuff you've already seen: to gently guide you into avenues that are more profitable for them, regardless of what you say you really want.
uBlock Origin static filters to the rescue!
Block results from specific domains on Google or DDG:
google.*##.g:has(a[href*="thetopsites.com"])
duckduckgo.*##.results > div:has(a[href*="thetopsites.com"])
And it's even possible to target element content with regex with the `:has-text(/regex/)` selector.
google.*##.g:has(*:has-text(/bye topic of noninterest/i))
duckduckgo.*##.results > div:has(*:has-text(/bye topic of noninterest/i))
Bonus content: Ever tried getting rid of Medium's obnoxious cookie notification? Just nuke it from orbit on all domains:
*##body>div:has(div:has-text(/To make Medium work.*Privacy Policy.*Cookie Policy/i))
Adding -site:baddomain.com still seems to work in both Google and DuckDuckGo. You should be able to include it in the URL template your browser uses for search so it gets added to every query; you can build your blacklist that way. E.g.
https://duckduckgo.com/?t=hk&iar=images&q=-site:pinterest.com+-site:flickr.com+%s
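That URL template can also be generated rather than typed by hand. A minimal sketch in Python, assuming a personal blacklist kept in a list (the domain names and query here are just examples):

```python
from urllib.parse import quote_plus

# Hypothetical personal blacklist; swap in whatever domains annoy you.
BLACKLIST = ["pinterest.com", "flickr.com"]

def search_url(query: str, base: str = "https://duckduckgo.com/?q=") -> str:
    """Append a -site: exclusion for each blacklisted domain to the query."""
    exclusions = " ".join(f"-site:{d}" for d in BLACKLIST)
    return base + quote_plus(f"{query} {exclusions}")

print(search_url("knitting patterns"))
# https://duckduckgo.com/?q=knitting+patterns+-site%3Apinterest.com+-site%3Aflickr.com
```

Note that search engines cap query length, so this scales to a few dozen exclusions at most, not a serious blocklist.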
As an aside, I haven't used Google in a while, and I find it interesting that the first page now shows only about five results, and those at the very end of the page. The rest is widgets like "top stories" and "people also ask".
The answer is simple: you don't pay them for that. They run a balancing act of providing you useful results while surreptitiously shoving trash they actually make money from at you, hoping you won't notice or complain.
Because ultimately there is a conflict of interest between you as a search user and Google's actual customers: advertisers, who are often spammers themselves. Someone needs to pay for Google search, and since it isn't you...
Google might have been better in the past, but since there is absolutely no serious competition whatsoever from a market perspective, Google technically doesn't really need to care about the quality of its search results anymore, only maximizing profits.
Kagi allows you to weigh domains.
If search is free, then you are the product. Now if there was a paid search engine, and you were the customer, you would expect customer service, customization, no ads, privacy protection, etc.
Google in particular doesn’t care about what features the small minority of power users would want. Otherwise they wouldn’t have removed so many over the years.
Gotta do it on your end. The uBlacklist extension works with several search engines in Firefox and Chrome.
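If memory serves, uBlacklist takes Chrome-style match patterns in its options page, one rule per line, so a personal blocklist might look something like this (domains are just examples):

```
*://*.pinterest.com/*
*://*.thetopsites.com/*
```

It can also subscribe to a remotely hosted list in the same format, so a blocklist can be shared across machines.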
I wish I could do this to tell DDG that I don't want to see any amazon.com products in my listings. I fell out with Amazon years ago and have been shopping independently since, but they have a stranglehold over search engines with their out-of-stock listings.
Doesn't answer your question, but I wish Google would use their enormous resources to fight this copy spam (maybe it's so hard that they are trying and you just can't tell).
In my opinion, most mainstream websites are spam by now. I understand why Google won't let me vote on the issue because they'll surely not like the results ;)
Implicit signals are far more significant than explicit ones. Search engines already know which links are not getting clicks, which links cause re-queries, and which links cause people to press the back button.
People do dumb things if you let them hide results from a domain or even "report spam". For example, Fox News and CNN would each be marked as spam millions of times a day, even by people who found exactly what they were looking for.
I think people are right that the motivation is profit, but I'm not convinced it's to manipulate you in any way. I don't think driving you towards SEO'd blog spam is really all that profitable to Google.
My guess is that it's because it's an abusable feature, and that means hiring human moderators for it.
I've long wondered the same! I recently started using Kagi, and it does have that feature. I've already blocked several domains from search results there, and I think the results pages are better (for me, anyway) as a result.
The more time you spend looking for the result you want, the more ads you will see. Some of those spam domains are filled with ads, as well.
What if we all started clicking on every ad on every spam page? Would they eventually get kicked off all the ad platforms for fraud?
The only reason why Google and Facebook work is because their algorithms believe that we only click on the ads that we're interested in. If we click on every single ad, we will completely break their system within 3-6 months.
I see lots of “google is evil” narrative here… in reality they could add a feature to collect spam flags and still disregard user preferences. It’s just that the data will probably not help them, since adversaries are much more motivated to manipulate it for SEO profit than the average user who is unlikely to repeat that search
Because no product manager has been able to push it through the various layers of bureaucracy (yet).
Plus, more customization would mean their ML-based personalized SERP has failed to understand your intent/needs.
As a returning Diablo 2 player, it's sad that websites with useful information from 20 years ago are gradually being replaced by SEO-optimised crap.
Simple answer: Google doesn't make money from giving you the right results. Google makes money by keeping you searching.
Neeva has this feature. Configure sites you want to see more from and sites you want to see less from. Easy.
I want a search engine that lets me exclude sites that have ads. Or even just exclude JavaScript.
Kagi does. I love it. Whenever they move from beta to paid, I'll be a paying customer.
...because pinterest would lose all their pageviews.
The second this is added, the spam domains will just multiply. User intervention against spam doesn't work.