HACKER Q&A
📣 eigenvalue

What can Google realistically do to fix AI-driven SEO tricks?


This tweet has been making the rounds recently, with most commenters bemoaning how tactics like this have quickly made search almost useless:

https://twitter.com/jakezward/status/1728032639402037610

But spammers are going to spam, and as long as there is a strong financial incentive to game the system, it's going to happen more and more. So assuming we are stuck with this kind of behavior, what can Google pragmatically put into place to minimize the impact on organic search ranking?

Some ideas off the top of my head:

- Specifically look to detect this sort of thing: a new website pops up out of nowhere with a huge number of never-before-seen articles, the articles closely mirror the sitemap of another currently high-ranking website, and analysis of the articles suggests a high probability of AI generation (that part is obviously tough). All of this directly leads to banishment from the first 100 pages of results, at a minimum. (A rough sketch of such a check follows this list.)

- Give massive preference to pages that have been up continuously for at least 2-3 years.

- If you can't find any organic links to the site on places like HN/Twitter/Reddit (where the comments/posts containing those links have at least a few upvotes), then it's probably not a "real" site.

- Boost rankings for content that is clearly attributable to a real person whose existence can be verified in various ways (the same way you might look up who the poster of an HN comment is IRL).
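
To make the first idea concrete, here is a very rough sketch of the kind of check I mean. Everything in it is a made-up placeholder: the thresholds, the sitemap-overlap metric, and the ai_likelihood score are assumptions for illustration, not anything Google actually uses.

    from dataclasses import dataclass

    @dataclass
    class Site:
        domain: str
        age_days: int            # how long the domain has been in the index
        article_urls: set[str]   # article slugs taken from the sitemap
        ai_likelihood: float     # 0..1, output of some AI-text classifier

    def sitemap_overlap(candidate: Site, established: Site) -> float:
        """Fraction of the candidate's slugs that mirror the established site's."""
        if not candidate.article_urls:
            return 0.0
        shared = candidate.article_urls & established.article_urls
        return len(shared) / len(candidate.article_urls)

    def looks_like_sitemap_heist(candidate: Site, established: Site) -> bool:
        """Flag a young site whose content map copies a high-ranking site."""
        return (
            candidate.age_days < 180                       # popped up out of nowhere
            and len(candidate.article_urls) > 1000         # huge burst of articles
            and sitemap_overlap(candidate, established) > 0.6
            and candidate.ai_likelihood > 0.8              # the genuinely hard part
        )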

I feel like this is absolutely a solvable problem if they take quick, decisive action. At the very least, they could try to minimize the impact. Otherwise search really will become useless and we will lose a very powerful tool.


  👤 PaulHoule Accepted Answer ✓
My impression is that the web died in the 2010s and it has only now become fashionable to notice, because A.I. seems to be a thing. (E.g., Google was skimming off all the money in the web ecosystem for a long time, and now all of a sudden people are mad that A.I. is going to write bullshit articles that thoroughly scramble everything they've written.) I would point to this guy's blog

http://www.seobook.com/blog

where he was at about the same level of despair you are at 10 years ago. Recent posts reveal a lot of information about ranking factors in modern search engines, and just how bad the problem is from a technical perspective (link age being useful but pernicious in its own way, for example) and from a social perspective (if Google's results were good, why would you ever click on an ad?).

To harp on link age, there is a tough balance between two factors:

1) Content that has stood the test of time is likely to be good, and

2) Content that is fresh is likely to be up to date

If we really privilege old sites decisively, then we entrench a set of old sites whose owners have no reason to keep them up to date. Even if this policy looks good in the short term, it would be a disaster in the long term, because it means there is no reason to start new sites, old sites have no competition, and the web gets worse.
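
To make that tension concrete, here is a toy scoring function; the weights and time constants are invented for illustration, not real ranking factors.

    import math

    def rank_score(age_days: float, days_since_update: float,
                   w_age: float = 0.5, w_fresh: float = 0.5) -> float:
        """Toy blend of 'stood the test of time' and 'kept up to date'."""
        trust = 1 - math.exp(-age_days / 730)          # saturates after a couple of years
        freshness = math.exp(-days_since_update / 90)  # decays over a few months
        return w_age * trust + w_fresh * freshness

Push w_age toward 1.0 and you entrench stale incumbents; push w_fresh toward 1.0 and you reward the farms that publish thousands of pages a day.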


👤 bradley13
You're likely right. Even if it's not completely solvable, Google could still do a lot better.

Anecdote: just today, I searched for a restaurant by name and town. The first several results were sites like TripAdvisor. The restaurant's own site was far down the list. Why? My bet: Google makes good money off TripAdvisor & Co., because they also advertise, but probably none from the restaurant.

Google has allowed their ad business to drive their priorities. Everything else is secondary.


👤 CM30
For the given case, look for large amounts of added content that just happens to match up with a competitor's. Very few legitimate sites will add 1000+ articles/pages in a short amount of time, especially not ones that aren't, say, product pages, category pages, or other normally automated setups.

And honestly, looking for ridiculous amounts of pages being added in short periods of time should be one of the first things Google flags for a website. Ignoring a few very unlikely exceptions (site is bought out and merged into another one, site goes viral to the same degree as TikTok, Amazon sets up another store), adding thousands or tens of thousands of pages in weeks, days or hours should at least get a site flagged for manual review.
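
Something as crude as the following would already catch the worst offenders. It assumes the crawler keeps per-day counts of newly discovered URLs per site; the thresholds are invented for illustration.

    def should_flag_for_review(new_pages_per_day: list[int],
                               daily_limit: int = 500,
                               weekly_limit: int = 2000) -> bool:
        """Flag sites that add pages far faster than any normal editorial site."""
        last_week = new_pages_per_day[-7:]
        # Known exceptions (mergers, big marketplaces) could sit on a manual allowlist.
        return max(last_week, default=0) > daily_limit or sum(last_week) > weekly_limit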

Similarly, sites with a large percentage of content copied from elsewhere (like those Stack Overflow ripoffs) should be easily downranked into oblivion. Forget AI: one of the biggest problems Google has right now is that obvious copycats rank as high as (or even higher than) the originals, despite the company clearly having numerous ways of knowing which site is the original and which is the ripoff.
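
Near-duplicate detection for this is a textbook technique: word shingles plus Jaccard similarity. No claim that this is how Google does it, but it shows how cheaply the obvious ripoffs could be spotted.

    def shingles(text: str, k: int = 5) -> set:
        """Break text into overlapping k-word shingles."""
        words = text.lower().split()
        return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

    def jaccard(a: set, b: set) -> float:
        """Jaccard similarity between two shingle sets."""
        if not a and not b:
            return 0.0
        return len(a & b) / len(a | b)

    # If jaccard(shingles(copy_text), shingles(original_text)) is high and the
    # original was indexed first, downrank the copy.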

Also, perhaps treat different types of sites differently. For example, when a wiki is forked, you often end up with two sites with similar content: the original and the fork. Prioritise the one with more edits and activity, and you'll see things like Wikia/Fandom get rightly wiped out SEO-wise.

And yeah, definitely use social media as more of a benchmark for legitimacy, since virtually everyone uses it in some way to communicate online. If a site isn't mentioned by anyone who's been around for a few years or who has a decent reputation, then don't rank it as high as one that such people are mentioning. Things like Hacker News and Reddit karma, likes/retweets/their equivalents, mentions on blogs and personal sites, links from wikis, etc. should be treated as authenticity factors much more strictly than they are at the moment.
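
Even a hand-wavy weighted score over those signals would be a start. The signal names and weights below are invented for illustration; a real engine would obviously tune this very differently.

    import math

    def authenticity_score(hn_points: int, reddit_karma: int, retweets: int,
                           blog_mentions: int, wiki_links: int) -> float:
        weighted_signals = [
            (hn_points, 3.0),      # HN karma on posts/comments linking to the site
            (reddit_karma, 2.0),   # Reddit karma on posts/comments linking to it
            (retweets, 1.0),       # likes/retweets or their equivalents
            (blog_mentions, 2.5),  # mentions on blogs and personal sites
            (wiki_links, 2.5),     # links from wikis
        ]
        # Log-damp each signal so a single viral thread can't dominate the score.
        return sum(w * math.log1p(max(v, 0)) for v, w in weighted_signals)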


👤 carlosjobim
> - If you can't find any organic links to the site on places like HN/Twitter/Reddit (where the comments/posts containing those links have at least a few upvotes), then it's probably not a "real" site.

Hard disagree. That makes it impossible for newcomers to rank, because not everything is of interest to the social media crowd. If you're opening a new bakery, nobody will be talking about it on HN, Twitter or Reddit, even if you have the best stuff in town.

Why do you occupy yourself with trying to solve a problem for a trillion-dollar company that could easily solve it themselves? Why do you care about them? They sure don't care about you in the slightest.

If you want to use a good search engine that tries to fight SEO spam, then you have Kagi as a good option already.


👤 solardev
You know, isn't it possible that the AIs might actually be better at rewriting human-written SEO spam into better articles, even if they steal the content or ideas? Should "written by a biological person" really be the dominant factor in results ranking?

Ideally we'd just get to the point where search results are themselves unnecessary because an AI can summarize the results, distinguish the spam from the good stuff on its own, and present original research without hallucination, citing prior sources (human or otherwise).

The spam and lists/summaries problems might actually get better once it's just AI vs. AI, there's no human in the loop, and nobody is putting money into SEO spam anymore because they can't outcompete the AI writers. Humans and AI can still both provide new material not yet in the training sets, and maybe we could get rid of an entire industry of junk writing?


👤 i_have_an_idea
I doubt the approach described in the Twitter thread actually works for various reasons.

However, for a while now, the name of the game in SEO has been authority.

Do the things that help establish your authority and you will rank well. No matter what AI or SEO tricks other sites use.


👤 Kelteseth
IMHO, I don't think Google can. The mix of GPT-4 and the ever-worsening quality of Google search, which was declining even before the whole OpenAI revolution, pretty much killed Google for me. For the rest I use DDG.

👤 smoldesu
Does anyone have examples of particularly "ruined" searches? I keep hearing about these examples but I've never encountered them any more than the SEO-optimized garbage blogspam of the 2010s.