HACKER Q&A
📣 jgwil2

Google spam filter getting worse?


I have noticed an uptick in uncaught phishing messages in the past few months, and talked about it to a friend who observed the same. Anyone else?


  👤 tvanantwerp Accepted Answer ✓
For months now, emails with subjects like "MCAfeeconfirmati0n--#21845315" and "confirmation#4073301981" have been hitting my inbox. These are such obvious spam emails that I'm unsure how the spam filters aren't catching them. Reporting them as spam hasn't done anything to catch them.

👤 cochne
Google probably lets some amount of known-spam emails through for data gathering. See this quote from Google's "Rules of Machine Learning" [1] (A great resource by the way)

> Rule #34: In binary classification for filtering (such as spam detection or determining interesting emails), make small short-term sacrifices in performance for very clean data.

> In a filtering task, examples which are marked as negative are not shown to the user. Suppose you have a filter that blocks 75% of the negative examples at serving. You might be tempted to draw additional training data from the instances shown to users. For example, if a user marks an email as spam that your filter let through, you might want to learn from that.

> But this approach introduces sampling bias. You can gather cleaner data if instead during serving you label 1% of all traffic as "held out", and send all held out examples to the user. Now your filter is blocking at least 74% of the negative examples. These held out examples can become your training data.

> Note that if your filter is blocking 95% of the negative examples or more, this approach becomes less viable. Even so, if you wish to measure serving performance, you can make an even tinier sample (say 0.1% or 0.001%). Ten thousand examples is enough to estimate performance quite accurately.

[1] https://developers.google.com/machine-learning/guides/rules-...
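
For anyone who wants to see what the "held out" idea in that rule could look like, here is a minimal sketch. The hashing scheme and the names `HOLDOUT_RATE`, `is_held_out`, and `route` are my own assumptions for illustration; the guide doesn't say how Google actually picks the held-out sample.

```python
import hashlib

HOLDOUT_RATE = 0.01  # assumed value: the 1% mentioned in the quoted rule


def is_held_out(message_id: str, rate: float = HOLDOUT_RATE) -> bool:
    """Deterministically place roughly `rate` of all messages in the held-out set."""
    digest = hashlib.sha256(message_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def route(message_id: str, classifier_says_spam: bool) -> str:
    """Return 'inbox' or 'spam'; held-out messages bypass the filter verdict."""
    if is_held_out(message_id):
        # Delivered regardless of the verdict, so the user's own
        # "mark as spam" action gives an unbiased label.
        return "inbox"
    return "spam" if classifier_says_spam else "inbox"
```

With a 1% hold-out, a message the classifier already knows is spam still has about a 1 in 100 chance of landing in the inbox, which is the "small short-term sacrifice" the rule describes.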


👤 mtmail
Gmail is the prime target for all spammers. I see regular reports, also for Google Search results. Nobody has an answer really.

3 days ago "Tell HN: Gmail's spam filters have gone bonkers" https://news.ycombinator.com/item?id=34411009

1 month ago "Ask HN: Do you all get spam in Gmail daily?" https://news.ycombinator.com/item?id=34093812

4 months ago "Ask HN: What's happening with Gmail spam filtering?" https://news.ycombinator.com/item?id=32923098

"Ask HN: Is Gmail spam out of control for everyone else too?" https://news.ycombinator.com/item?id=30315116


👤 nanidin
I run my own mail server + spam filter, so I'll chime in. I have seen a high uptick in spam making it to my inbox in the last two weeks. I primarily rely on Spamhaus blocklists + a Bayesian filter trained on old spam.

The uptick I have seen is going from 0-2 spams making it to my inbox to 10-20 spams making it to my inbox. When this has happened in the past, I have assumed it is spammers bypassing blocklists by finding new hosts, or by spammers finding a clever way to beat the filter. Usually after these big upticks, they drop off again suddenly, which makes me believe that it was a blocklist bypass and not a filter bypass (my filter is pretty weak and hasn't been retrained/updated in many years.)
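
For readers who haven't seen one, a Bayesian filter can be sketched in a few lines. This is a toy naive Bayes classifier with add-one smoothing, not the actual filter described above; the class name, tokenizer, and training strings are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled message; label is 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_log_odds(self, text):
        """Log of P(spam | text) / P(ham | text) under the naive Bayes assumption."""
        vocab_size = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        spam_total = sum(self.word_counts["spam"].values())
        ham_total = sum(self.word_counts["ham"].values())
        # Prior odds, smoothed so an untrained class never yields log(0).
        score = math.log((self.doc_counts["spam"] + 1) / (self.doc_counts["ham"] + 1))
        for tok in tokenize(text):
            # Add-one smoothing; the extra +1 reserves mass for unseen words.
            p_spam = (self.word_counts["spam"][tok] + 1) / (spam_total + vocab_size + 1)
            p_ham = (self.word_counts["ham"][tok] + 1) / (ham_total + vocab_size + 1)
            score += math.log(p_spam / p_ham)
        return score

    def is_spam(self, text):
        return self.spam_log_odds(text) > 0
```

A filter like this only knows the words it was trained on, so one that hasn't been retrained in years scores new spam vocabulary as roughly neutral.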


👤 andrewmcwatters
Yes, it's been measurably worse for somewhere on the order of months to years now.

I'm not sure what they've changed internally, because if they have talked about their engineering strategy for spam detection (which I doubt, since it's probably asymmetric information), no one has shared writings about it.

Nevertheless, I get obvious spam in my inbox now, and important email occasionally goes straight to my spam filter now.

People here on HN have been speculating that they moved to some sort of machine learning model, probably because employees were incentivized to pervert the existing product for promotion purposes by gaming internal metrics to prove they've had an impact.


👤 maicro
Another anecdotal datapoint, but - I haven't noticed an uptick in actual spam making it to my primary inbox. I can't give solid numbers, but it's not been bad.

This includes a marked increase in crypto spam/phishing emails due to the cointracker email list breach - those have pretty much exclusively gone straight to Spam (including those using Google Sheets so it has an official Google sender email).

Again, just an anecdote, and I don't doubt that you and anyone else reporting an increase is experiencing it.


👤 georgel
I have a month-old business email for my new company, set up with GSuite, and Google's own onboarding emails went directly to spam in that inbox. I haven't marked any emails as spam with this new account yet.

👤 DownGoat
I have been getting tons of PDFs whose previews show pictures of women. The subject and body of the emails just seem to be random words, like in a seed phrase, with some random single-digit numbers. The emails are sent from Office, Hotmail or Gmail accounts and pass verification. The TO field is also filled with other addresses. I have been getting these for like 3 or 4 months, and reporting them as spam does not work. In all the years I have had a Gmail account, spam has never really been a problem.

👤 svdr
We have the opposite problem. We send lots of newsletters (no spam, mostly government) and have an excellent IP reputation (Senderscore, Microsoft), except for Gmail, where our IP reputation has declined in recent months, for no apparent reason.

I hope they are just tweaking things!


👤 jeffbee
I'll try to address the specific question that seems to have been asked, which is about phishing. Phishing and spam are two different classes. Spam is largely classified based on metadata about the transaction and only to a lesser extent the body of the message. Phishing, on the other hand, is almost purely based on the content, because it revolves around stuff like the message seems to attempt to confuse the recipient about the sender's identity, or includes URLs that appear to be intentionally confusing, or is using domain names that seem to have been intentionally formed to mimic your organization's domains (for Workspace customers). So you are going to see very different outcomes for spam and for phishing, and quite different outcomes for gmail.com accounts vs. Workspace accounts.

👤 deviantbit
Yes. Google has loosened their spam filters. I have noticed.

My educated guess on why? Lawsuits from political parties, notification of class action litigation against Google and others, union notifications, insurance notifications, and similar emails ending up being caught by spam filters.

The lawsuits are piling up.


👤 GistNoesis
Not exactly spam, but quite often mail is badly sorted and promotional mail gets into the main inbox. One of the main offenders is AliExpress. Every day they send some mail from various addresses: buyer01.m@mail.aliexpress.com, services01@aliexpress.com, exclusive01@mail.aliexpress.com, ae.like18@mail.aliexpress.com, buyer-info18.m@mail.aliexpress.com

And every month or so they vary the numbers and I have to tell the filters to route them appropriately to the junk folder. (And I have to do it one mail at a time, because if you select multiple mails with different addresses, the filter doesn't offer to add them to the filter list.)
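
If your mail setup lets you script rules, matching on the sending domain instead of the full address sidesteps the rotating numbers. A small sketch, where the pattern and the function name are hypothetical and this is not Gmail's own filter syntax:

```python
import re

# Hypothetical rule: any sender at aliexpress.com or one of its subdomains,
# no matter what the local part (buyer01.m, ae.like18, ...) looks like.
ALIEXPRESS_SENDER = re.compile(r"@(?:[a-z0-9-]+\.)*aliexpress\.com$", re.IGNORECASE)


def is_aliexpress_sender(address: str) -> bool:
    """True if the address belongs to aliexpress.com or a subdomain of it."""
    return ALIEXPRESS_SENDER.search(address.strip()) is not None
```

The `$` anchor matters: without it, something like `x@aliexpress.com.example.net` would match too.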


👤 callumprentice
I've used Gmail since 2003 and consequently was (un)lucky enough to get my $FIRSTNAME@gmail.com - it's certainly handy but boy do I get a lot of spam - 300-400 a day, I expect.

I've definitely noticed an uptick recently and what is most perplexing is that some seem like they'd be easy to catch - in fact, I set up some Gmail filters to do so and they seem to be working 100%.


👤 lefstathiou
It absolutely is. A few weeks ago I decided to create what are now dozens of rules to manually filter out spam and it has been extremely effective for me (20+ a day were hitting my work inbox).

My best filters target the “opt out / unsubscribe” language people put in their footers. I iterate a few times a week as things sneak in. I’ll never get 100% but the results have been very positive.


👤 runnerup
I get dozens of terribly-formatted spam emails per day. I get about 2x-3x as many "spam" emails as I do real/marketing emails. "Spam" is in quotes because the emails make no sense...I can't see anything they're even selling. There will just be one non-sensical link by itself that I never click.

Example:

from: runnerup info_GBAQBHFLXV@news.ukgkkwwumjhqu.edu via netorg12672764.onmicrosoft.com

subject: --confrmtion-70346102

content: a link labeled: "runnerup-N0tificati0n"

I can't tell if Gmail can't figure out these are all spam, or if they purposely send me 1-2 dozen every day because I religiously label them all as spam and no one else does. Maybe that's because it's super hard to label an email as "spam" using their mobile app! My wife has also noted a large uptick in similar spam emails over the past 6 months.


👤 ohyoutravel
I was just about to ask this on here! I regularly check junk mail just in case and it’s been crickets for a long time, but in the last couple of months I seem to get like 3-4 spam emails in there a day, and regularly some in my inbox too, usually a Geek Squad or McAfee “purchase” receipt. Very clearly spam.

👤 sp332
Don't forget to check your spam folder for ham as well!

👤 autotune
I noticed this as well. I'm switching to a relatively new service called Tutanota, as I haven’t heard great things about Fastmail and ProtonMail when it comes to spam, and I'm looking for something using open source tooling. We’ll see how it goes.

👤 fuzzythinker
Yes, I got the blatantly obvious but still scary coinbase one:

The subject says "paid" -- "Reminder - You have paid an invoice"

but the email says to pay it.

"Please pay your invoice

Coinbase would like to remind you to pay invoice xyz.

Amount due: $599.00 USD"

With sender email being paypal.


👤 jedberg
I think it's a combo of two things:

1) To get the best training data, you sometimes need to let things you've classified as spam into the inbox to verify that the user marks it as spam. It's pretty standard for training a classification system to occasionally pass negative samples to verify their negativity.

2) The spam filter itself almost certainly has a latency budget, and if it can't respond in time, the message is passed unfiltered. In other words I think the spam filter fails open. It's probably just been down more lately.
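
To make point 2 concrete, here is roughly what "fails open" means. This is a guess at the general pattern, not anything Gmail has documented; `classify_with_budget` and its parameters are invented for the sketch.

```python
import concurrent.futures


def classify_with_budget(classify, message, budget_seconds, executor):
    """Run `classify(message)` on `executor` and return its verdict (True = spam).

    If the classifier does not answer within `budget_seconds`, fail open:
    treat the message as not spam so that delivery is never blocked.
    """
    future = executor.submit(classify, message)
    try:
        return future.result(timeout=budget_seconds)
    except concurrent.futures.TimeoutError:
        # Deadline missed: the message goes through unfiltered.
        return False
```

Under this design, a slow or overloaded classifier shows up to users as more spam in the inbox, not as delayed mail.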


👤 jiripospisil
Gmail's spam filter is still a dream compared to Outlook (I use both regularly). Absolute lazy garbage gets through to my inbox and has for years [0]. I used to complain about it to their support but nothing has changed. I certainly wouldn't recommend Outlook to a non-tech-savvy person.

[0] https://twitter.com/JiriPospisil/status/1108355909099667462


👤 thewebcount
I think there's something bigger going on. At work we have non-Google accounts and it's gotten worse there as well. Like all of a sudden about 3 or 4 months ago, I just started getting bunches of very very obvious spam that wasn't caught by our previously pretty good filters. (Subjects like "Meet Russian Brides" and From fields of "Foo Print Advertisement", etc.) I wonder if someone has figured out how to game the current system or something?

👤 bluedino
I get 10-20 a day.

Lately it's been Google Classroom invitations from sex bots. Along with the random crap that doesn't make any sense, and the McAfee/Yeti Cooler junk.


👤 paxys
Spam filtering is a cat and mouse game. The moment you think you have the "perfect" set of rules, scammers will figure out how to game them. Then you'll have to make changes to handle the additional cases. Rinse and repeat.

I have anecdotally seen slightly more types of scam/phishing messages slip through the filter in recent weeks, but I assume it'll go away in the next round of updates from Google's side.


👤 gwbas1c
I get a significant amount of recruiter spam: Every day I get 2-4 cold emails from random recruiters that are generally poor matches. (I.e., the recruiter never read my resume, is probably sending email to thousands of people, etc., etc.)

I always mark these as SPAM, and the next day more recruiter spam comes in. There's no way to unsubscribe because they always are from some random independent company.


👤 Thoreandan
Yes, I've been seeing more in GMail, but that's nothing compared to Google Photos spam and Google Calendar spam, which I get hit with every other day.

To maximize the ridiculousness, Google sends me an email thanking me for each image abuse report or chat abuse report done in Photos -- but they don't seem to be actually /doing/ anything about it.


👤 bryan0
Has Google ever publicly talked about their spam filter performance over time? For me this past year I get obvious spam messages in my inbox every week. Is it that they can no longer filter at the required scale? It seems hard to believe these messages could evade even the most rudimentary filters, so I assume they're not being filtered at all.

👤 xnx
The spam arms race continues to escalate. Broad availability of tools like ChatGPT has probably helped spammers in the short term.

If any good can come of this long term, it would be the ability for me to charge people to get an email into my inbox. This has been proposed multiple times over the decades, but has never been more needed or feasible than now.


👤 Mattasher
I'm having the opposite problem. Sometimes even my replies to someone with a Gmail address go to their SPAM box. What kind of a filter decides you don't want to see a message from someone you messaged first?

FWIW I have my own domain and switched to Google as backend long ago, and yet I still occasionally have this problem.


👤 davidw
There are several Google Groups that I subscribe to and this regularly happens:

A real person who I know in real life, whose messages I care about, posts to a Google Group from a Gmail account, and the message ends up in my own Gmail spam folder.

Like - the message didn't even leave the Google infrastructure and it got tagged as spam?!


👤 tclancy
Try to treat it like weather. Sometimes things are clear for weeks, then you get hit with storms. My wife and I both have had Gmail accounts forever and we never see the onrushes of spam at the same time. So I think it's the noise of two algorithms fighting. We should all get used to it.

👤 charcircuit
I've been getting phishingesque emails from domains that are just random characters .ml (free domains)

👤 eisolo
I've experienced kind of the inverse of that lately -- using Workspace (and their domains) for email, and regular outgoing emails are ending up in recipients' spam boxes. SPF/DKIM/DMARC/etc. are all set up correctly and tested (and have been working fine for many years).

👤 insanitybit
I'm determining how to migrate off of gmail. My inbox has been destroyed and I can no longer use it reliably, it's impacting my personal life. Spam comes in every day and no amount of "mark as spam" can save me apparently.

👤 phendrenad2
Phishing emails in particular seem to go right through most spam filters. It seems like email providers should be focusing on these (spam emails don't annoy me as much as an email carefully crafted to steal my identity!)

👤 thearn4
I absolutely had this issue most of last year, but the Gmail spam filter seems to be catching things more effectively for me in the last two months.

👤 kmfrk
There has definitely been some more getting through the filters the past few weeks for me. Maybe January is a special month for spam or something.

👤 brianjking
The main issue is so many scammers are signing messages with DKIM/SPF too.

👤 cowvin
Yeah, close to the November election, it felt like Google stopped filtering election spam email suddenly.

👤 rryan
In the past few months I've been getting much more fake order / payment spam.

👤 hi5eyes
Google needs to do something about google forms/drive spam and other bypasses

👤 dekhn
Yes, it's getting worse. I get and mark as spam the same email pattern over and over.

TL;DR: almost all the people who care about quality at Google are gone or not in a position to improve the product


👤 Kukumber
don't use your main email to register on random websites

that's as simple as that


👤 greggman3
How about reframing to "Is spam getting harder to filter?"

No one at google wants spam.