HACKER Q&A
📣 josephcsible

Why does Google rank the real Python documentation below content farms?


Do a Google search for "python endswith". Obviously, https://docs.python.org/3/library/stdtypes.html#str.endswith is the correct best hit for that term. Why does Google rank that page below all of the well-known low-quality content farms like GeeksforGeeks, W3Schools, and Tutorialspoint?


  👤 skilled Accepted Answer ✓
Because the page you've linked has 24,599 words in total, out of which only 43 words are dedicated to "str.endswith". You can make a beefy SEO article specifically on this method by talking about it a bit, and then providing a lot of examples in various different scenarios.

The bottom line being that there is nothing you can do about it unless Python themselves fundamentally change how the documentation is structured, which I doubt they have plans to do.


👤 cshimmin
Google can't make money serving ads if you go to the ad-free python.org.

Obviously I'm not saying they're singling out python or documentation in general as some kind of cash cow. More realistically the story is that sites that serve ads make money, and can spend money on cat-and-mousing SEO to keep making more money. Technical docs aren't going to do that. Google could whitelist them but it seems they turn a blind eye for the aforementioned reason.


👤 zb3
Google search is now practically shit. Instead of adapting its ranking so it links to high-quality content, it forces sites to adapt to its ranking model so they have to produce meaningless crap.

👤 wood_spirit
It’s a general problem. In the past I have seen great HN threads sharing exclude lists etc. to avoid the scrapers. Google could, of course, track a few thousand canonical sources so that just python and SO and a few others don’t get beaten to the top in their specific expert areas, but it doesn’t generalize. Google has no real algorithmic idea who is copying whom.

And then it hits me: the proper documentation for things is on pages without ads! Perhaps that’s the signal google needs to start weighing heaviest…? ;)


👤 tayo42
the worst thing about those sites is they are slow.

I get why at least with the python docs, they're a little dense. some of those others have example uses which I could imagine people find useful.

geeksforgeeks with the login nag page is pretty bad.


👤 quietbritishjim
I've come across this many times and I've come to think it's an issue with that specific page in particular. I think the problem is partly on Python's end: they've got a single page where they've jammed in all of the built in types (str, dict, list, complex, ...). It's a huge list of types and a correspondingly huge page!

I suspect if there were separate pages for each type then it would be ranked higher... and it would actually be more useful. I don't get why they've done it like that.

The higher ranked pages admittedly have a whole page just for a single method, which is too far in the other extreme and is obviously more for SEO than use. But with the Python docs the way they are, we'll never know whether a more sensible official page would beat them or not.


👤 mabbo
As others have said, the pages with the most ads are the ones Google wants you to go to. Their highest priority is making money. Sometimes, that means showing you the right, best content, but a lot of the time now that's not the case.

Which is why LLMs have Google scared, in my view.

If an LLM has all the answers, you don't need to hand over your question to Google so it can steer you to the "right" (ad-filled) answers. It just knows, and tells you. Yes, hallucinations are still a problem but they aren't a growing one. LLMs that can provide you a reference to the right docs will be a thing soon if they aren't already.

How does Google make money in a world where fewer and fewer people need to ask them for where to find the answer?


👤 rabbits_2002
Those sites are plastered with Google ads so it makes Google more money to recommend them over a useful result.

Additionally Google guidelines for search ranking prioritize meaningless fluff and spam because they want to waste as much of your time as possible. More wasted time = more ad exposure.

The world's largest search engine is owned by the world's largest advertising company. I am surprised no one saw this coming lol


👤 jenscow
While not answering the question, prefixing the query with `site:docs.python.org` will get you what you need. If you're doing this a lot then I recommend at least adding a search shortcut to:

    https://www.google.com/search?q=site%3Adocs.python.org+%s

then you can type something like `py endswith` in the address bar. Or use ddg's "I'm feeling lucky" (prefix the query with !) and go directly to the first result:

    https://duckduckgo.com/?q=!+site%3Adocs.python.org+%s

Even better, just use https://devdocs.io/
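
If you would rather generate those links than hand-encode them, here is a minimal sketch using only the standard library. The function names are made up for illustration; the URL patterns are the two shortcuts above with `%s` replaced by the query.

```python
from urllib.parse import quote_plus

DOCS_SITE = "docs.python.org"  # the site used in the shortcuts above


def google_docs_search(query: str, site: str = DOCS_SITE) -> str:
    """Build a Google search URL restricted to one site via the site: operator."""
    return "https://www.google.com/search?q=" + quote_plus(f"site:{site} {query}")


def ddg_first_result(query: str, site: str = DOCS_SITE) -> str:
    """Build a DuckDuckGo URL whose query is prefixed with "!" (first-result jump)."""
    # safe="!" leaves the bang unescaped, matching the shortcut URL in the thread
    return "https://duckduckgo.com/?q=" + quote_plus(f"! site:{site} {query}", safe="!")


print(google_docs_search("endswith"))
print(ddg_first_result("endswith"))
```

`quote_plus` turns the colon into `%3A` and spaces into `+`, which is the same encoding the hand-written shortcuts use.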

👤 quest88
Check out devdocs.io. I'd suggest using this if you want to search official docs. But first you need to enable whatever language you want to search. I enabled python 3.11 and searched `endswith` and immediately found the documentation.

I know this doesn't answer your question, but I hope this helps you in the interim.


👤 maxFlow
GeeksForGeeks is the bane of my existence and IMO a symptom that the entire system has perverse incentives. Fortunately, I've moved 90% of my code inquiries to LLMs, to great success.

👤 jhoechtl
Short answer: The real Python documentation is paying less money to Google than the content farm.

Long answer: Balancing the many interests of search result parties, the decrease in consumer satisfaction is, by Google's benchmarks, outweighed by the money received from their paying customers.

Use Bing, results are relevant and they do not yet rank paying farms as number one.


👤 morkalork
Because adding exceptions to an algorithm doesn't scale, they won't add any. Because customer service doesn't scale, they don't have any. And so on. Frustrating isn't it? It's what happens when you let engineers design a product. Letting system behaviour at the limits dictate everything.

👤 melx
Not an answer to OP's question, but if you find yourself googling "python endswith" (as per example) then your DX is somehow broken.

Invest in a good code editor with linting. No more googling for such trivial things.


👤 __derek__
FWIW, DuckDuckGo gives me the same crap results[1], but at least the !bang syntax works.[2]

[1]: https://duckduckgo.com/?q=python+endswith

[2]: https://duckduckgo.com/?q=!python+endswith


👤 philomath_mn
Agreed that Google is not very helpful here, but searching the Python docs directly works pretty well:

https://docs.python.org/3/search.html?q=endswith

Otherwise I really like ChatGPT like this: you put in minimal work into the query and it usually fills in useful info. If you use "Advanced Data Analysis" mode it will run those examples in the browser.


👤 melx
I wanted to start a new search engine that would index content from ad-free websites only.

Then I realised StackOverflow has ads nowadays and my search offering would be useless for like 98% of devs.


👤 pdntspa
This shit is so frustrating, every query I ever have for Rails stuff goes to APIdock

👤 mtkd
It's become materially worse for technical queries recently

Can recommend phind.com -- especially for obscure documentation/usage questions and followups


👤 walthamstow
I've completely stopped using Google for this kind of thing. ChatGPT or devdocs.io to go straight to the docs

👤 AlexanderTheGr8
Google search heavily focuses on the last page that users visit for a query. And python documentation is hard for most novice programmers to understand.

My guess would be that engineers first go to documentation, don't understand it, go to low-quality-content-farms which answer their questions in natural language. It's low quality but it's enough for novice use cases such as python endswith.

And this leads to a big reduction in Google's ranking of python docs.

TLDR : most novice programmers don't/can't read docs.


👤 chomp
Kagi returns the expected link first, for what it’s worth.

👤 matt3210
I don’t see any ads on any of those mentioned pages…

👤 winddude
probably to do with realpython blocking content with the popup unless you sign in.

👤 babypuncher
We are to a point where I think Google should start manually de-ranking content mills and blogspam, or even stop indexing them altogether.

👤 somecommit
AI content farms are about to kill Google anyway

👤 freitzkriesler2
Because Google is run by business executives and clueless product managers. The engineers have been forced into the back to only think on whatever the current sprint is and told to buzz off when it comes to biznass decisions only those with MBAs can solve.

👤 yukinon
Hot take here, but as someone that doesn't code daily, I prefer those sites over the actual docs in most cases.

If I need to get something done quick, those sites will give me a quick 5 second refresher with clear examples.

Actually, in the doc you described as "obviously the correct hit", all I see is

> str.endswith(suffix[, start[, end]])

> Return True if the string ends with the specified suffix, otherwise return False. suffix can also be a tuple of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.

Meanwhile, the first hit in Google for me is Programiz, which has actual real examples without any additional clicking around or trying to understand how the information is structured.

Besides, I know the docs exist, I don't need a google search for it. I'll click on the content farms every time because they've consistently been the fastest way for me to get what I need.


👤 cooperadymas
Every one of those "well-known low-quality content farms" is a better result for someone searching "python endswith" than the official documentation.

You don't have to parse through a veritable novel of irrelevant results to find what you're looking for.

They provide example code to show you how to use the method.

They break down the usage more thoroughly than the official docs.

They _show_ you the different parameters you could pass to the method.

Some of them provide interactive REPLs where you can play with and test the method.

The docs break it down _technically_ but they leave questions. Are start/end inclusive? What does it mean to "stop comparing at that position"? Why would you use the start parameter if you're trying to find the end of the string? If you use start does the end parameter count from 0 or from start? What happens if you pass a start or end that are outside the bounds of the string?

Look, I think the Python docs are great and use them all the time. But for the average person looking for info on `endswith` - whether that's someone new trying to understand how it works, or someone experienced looking to understand the parameter types - those pages are more approachable.
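
For what it's worth, most of the questions above can be answered by poking at the method directly. A minimal sketch, assuming a made-up string `report.csv`: in each case `start` and `end` behave like slice indices into the original string, which the comments spell out.

```python
s = "report.csv"  # made-up example string, 10 characters long

checks = {
    # end is exclusive, like a slice: only s[0:6] == "report" is examined
    "end_exclusive": s.endswith("report", 0, 6),
    "end_too_short": s.endswith("report", 0, 5),   # s[0:5] == "repor"
    # end counts from the start of the whole string, not from `start`
    "end_absolute": s.endswith("port", 2, 6),      # s[2:6] == "port"
    # start trims from the left; a suffix longer than what remains cannot match
    "start_fits": s.endswith(".csv", 6),           # s[6:] == ".csv"
    "start_too_far": s.endswith(".csv", 7),        # s[7:] == "csv"
    # out-of-range positions are clamped instead of raising
    "end_past_len": s.endswith(".csv", 0, 100),
    "start_past_len": s.endswith("v", 50),
    # negative positions count from the end of the string
    "negative_start": s.endswith(".csv", -4),
}

for name, result in checks.items():
    print(f"{name}: {result}")
```

As for why you would pass `start` at all: it keeps a suffix from matching when it would have to reach back before that position, as in the `start_too_far` case.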


👤 wodenokoto
There are a lot of ways to look at it.

The "You're not the customer"-perspective: You as a user of google search are not the customer. The customer is the people placing ads on Google search, and secondarily the people placing ads on the pages google search leads users to.

The "it's an algorithm"-perspective: Google is a search engine, not a collection of curated links. In the past, Google has been very much against having humans rate results, but I think they actually have focus groups that come in a lab and do some searches and rate what they see (under the guise of being a different search engine, most likely). Google is very conservative about adjusting their algorithm (or at least have been) and small changes can lead to huge changes in income.


👤 sheepybloke
Personally, it's because the docs aren't that good unless you're looking for something very specific. People are talking about the fact that there are no ads, but I would think it's because of the bounce rate. I could see there being a lot of people learning Python who go to the docs, see a large chunk of text with OK examples, and then bounce to go to the other sites that have more examples. This is something that's happened to me a lot when I was doing Python development. Some of the stuff is very helpful on the docs (e.g. asyncio), but for other things the actual stuff that I want to do gets lost in the details of the docs. So while ads probably play a part, I think the bounce rate is a bigger factor, especially for people who aren't necessarily developers.

👤 TrevorFSmith
I highly recommend comparing google results with kagi.com results. The extreme difference in quality has a simple explanation: kagi.com is a paid service so it can downrank sites with many ads and tracking scripts.

Google needs that sweet surveillance money so its results are filled with crappy content farms both human- and LLM-generated. Kagi doesn't need to make money from ads so it can happily link to the highest quality sites, even if they don't take part in the targeted advertising economy.


👤 tmporter
Your post inspired me to do some research on options for blocking these sites and I stumbled upon the uBlacklist browser add-on. It's open source, easily configurable, supports multiple search engines, and you can even use community block-lists instead of building your own.

👤 Apreche
The real question is, why are these content farm sites indexed at all? They are spam, and should be blocked, just like the way GMail blocks spam. They should never appear in any search result for anything ever, let alone be ranked first!

If someone simply took Google and just applied a huge blocklist so that garbage sites like those never got indexed, it would be the perfect search engine.


👤 gniv
Does Google even index local anchors separately? The result I see for that query is https://docs.python.org/3/library/stdtypes.html

👤 cratermoon
It's a genuine problem, not only with Python but with other languages.

However, the api library reference is only one kind of documentation, and not necessarily what everyone is looking for. For whatever language I'm working in, I keep the library docs handy for immediate use, and only go to search the web when I'm looking for something beyond a dry reference. Maybe I want a tutorial, or short how-to for a specific task. Maybe I'm looking for something deeper, with context and explanation.

I somewhat agree with another comment here: the library reference docs should be a keystroke or click away in your development environment. Are there plugins for your preferred editor or IDE to make this possible? Use those. If you're looking for a different kind of documentation and it's not part of the official python site, maybe that's something to be addressed.


👤 serjester
As someone that loves python and uses it daily, the official docs are terrible for getting a quick answer. They're impenetrable as a result of being exhaustive. If you google anything react related their official docs will be the first to show.

👤 sdfghswe
Because google is shit.

👤 bjclark13
I'm not sure I would consider those results "low-quality". Also, if you want to use the official documentation, just use the search bar of docs.python.org. Then Google gets none of your views!


👤 akagusu
Because content farms give Google money and the official documentation doesn't.

👤 rspoerri
i'd love a search page that lowers the page rank if it contains ads. Maybe even a search that blocks all pages containing ads.

👤 drcongo
Google isn't here to help you find stuff, it's here to mine you for money.

👤 binary132
Call me crazy, but if I had to guess, most Python users would rather read the farm sites than use the official documentation.

👤 bjourne
Google doesn't index anchors (the parts after #) so the comparison is between https://docs.python.org/3/library/stdtypes.html and https://www.w3schools.com/python/ref_string_endswith.asp. The former page has a lot of content unrelated to "endswith" and thus is ranked lower. I also think that the "content farms" pages are useful because they offer up lots of examples which the official docs don't.
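
To make "the parts after #" concrete, here is a small sketch with the standard library. It only shows how the linked URL splits into a page and a fragment; the fragment is resolved by the browser and is not sent to the server, while how Google treats it is the claim above, not something the code demonstrates.

```python
from urllib.parse import urldefrag

# The URL from the original question
linked = "https://docs.python.org/3/library/stdtypes.html#str.endswith"

# urldefrag splits off the fragment (the part after "#")
page, fragment = urldefrag(linked)

print(page)      # the whole built-in types page
print(fragment)  # the in-page anchor for the method
```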

👤 PurpleRamen
> Obviously, https://docs.python.org/3/library/stdtypes.html#str.endswith is the correct best hit for that term.

Just because it's official, doesn't mean it's good documentation. The other results are significantly better for this specific query. They are more elaborate, have better readability, offer examples, and don't force you to search through a long text to find the 3 lines which are relevant for you. Some even have a live-test.

The only real benefit python.org offers here is to offer more documentation about the language itself. Which is interesting for beginners, but not necessarily for everyone else.


👤 morgango
The text ...

> Return True if the string ends with the specified suffix, otherwise return False. suffix can also be a tuple of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.

... without an example is NOT easier to use than an example for people learning Python. It uses language specific jargon (suffix, tuple), unexpected capitalization, and is needlessly terse.

How about ...

> Check if a string ends with a certain ending (or endings) and return True if it does, or False if it doesn't. You can specify one ending or multiple endings in a tuple. You can also choose where to start and stop checking within the string.

... along with an example of code that can be easily copied? That is the value the other sites provide. Readability and usability.
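
Something along these lines is the kind of copyable example meant here. It is only a sketch; the file name is made up for illustration.

```python
filename = "holiday_photo.JPG"  # made-up file name

# Plain check: the comparison is case-sensitive, so ".jpg" does not match ".JPG".
matches_exact = filename.endswith(".jpg")
print(matches_exact)

# Normalise the case first, then test several endings at once with a tuple.
is_image = filename.lower().endswith((".jpg", ".jpeg", ".png"))
print(is_image)

# The endings must be in a tuple; passing a list raises TypeError.
try:
    filename.endswith([".jpg", ".png"])
except TypeError as exc:
    print("TypeError:", exc)
```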


👤 rerdavies
Why exactly is MDN considered a more authoritative source than w3schools?

MDN would, I suppose, be a more authoritative source on what Mozilla thinks. And, presumably, a less authoritative source on what everyone else thinks.

The principal difference as far as I can see is that w3schools gives me the same information in 3 pages instead of 10.

My suspicion is that google cares less about what you think, and more about what everyone else thinks.