HACKER Q&A
📣 datapollution

What could be the downside of polluting the data collected by companies?


Time and again we have seen companies collect data on users who never agreed to their ToS, by building ghost profiles for them.

I have been trying to prevent data collection on my family by blocking ads, trackers and fingerprinting methods, but as far as I can tell it has become a game of cat and mouse. I still see personalized ads, and the entire endeavor feels futile.

Now I am thinking of running a bot that mimics a human browsing, but on bogus topics. For instance: open a random Wikipedia page, search Google for 5 to 10 words taken from that page, open links from the results arbitrarily, and click ads on those pages with a given probability. Repeat this throughout the day at some interval.
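To make the idea concrete, here is a rough sketch of the decision logic only, with no network access. The function names, the 10% ad-click probability, the cap of three links and the 20-minute mean interval are all placeholders I made up for illustration. Fetching the random Wikipedia page and the search results is deliberately left out, so the page text and result links are passed in as plain arguments.

```python
import random
import re


def pick_query(page_text, rng, min_words=5, max_words=10):
    """Pick 5 to 10 random words from the page text to use as a search query."""
    # Only keep alphabetic words of 4+ letters so the query is not all stopwords.
    words = re.findall(r"[A-Za-z]{4,}", page_text)
    count = min(rng.randint(min_words, max_words), len(words))
    return " ".join(rng.sample(words, count))


def plan_session(page_text, result_links, rng, ad_click_prob=0.1, max_links=3):
    """Plan one decoy session: a query, the links to open, and whether to click an ad on each."""
    query = pick_query(page_text, rng)
    if not result_links:
        return {"query": query, "visits": []}
    link_count = rng.randint(1, min(max_links, len(result_links)))
    chosen = rng.sample(result_links, link_count)
    # Each visit is (link, click_ad), where click_ad is True with probability ad_click_prob.
    visits = [(link, rng.random() < ad_click_prob) for link in chosen]
    return {"query": query, "visits": visits}


def next_delay_seconds(rng, mean_minutes=20.0):
    """Draw the wait before the next session from an exponential distribution."""
    # Assumption: irregular gaps look less mechanical than a fixed timer.
    return rng.expovariate(1.0 / (mean_minutes * 60.0))
```

The exponential delay is my own guess at "a certain interval"; a fixed timer would also match the description above but would probably be easier to spot as automated.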

Or even go as far as training an AI to observe my browsing habits and mimic them. But before I go that route, can anyone suggest problems with this experiment, or improvements to it? What downsides could such a thing have? Would it help in any way? Thank you.


  👤 keiferski Accepted Answer ✓
This isn’t my field, but: you might simply see further developments in determining whether users are ‘real’ or not, resulting in even more data collection and tracking. However, that probably only happens if a significant percentage of the data is known to be fake.

👤 smoyer
I've wondered the same thing ... I also have a small side project to create realistic-looking data dumps that can be put in unprotected S3 buckets, with the idea that collecting, verifying and selling card data would become harder if only 1% of the data were real.
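As a back-of-the-envelope check on the 1% figure, assuming records have to be verified one at a time and the real ones are scattered at random through the dump (both are my simplifying assumptions, and the function name is just illustrative), the expected number of checks per real record is the reciprocal of the real fraction:

```python
def expected_checks_per_real_record(real_fraction):
    """Expected number of records to verify before hitting one real record."""
    if not 0 < real_fraction <= 1:
        raise ValueError("real_fraction must be in (0, 1]")
    # Mean of a geometric distribution with success probability real_fraction.
    return 1.0 / real_fraction


# With 1% real data, roughly 100 verifications are needed per real record.
checks_at_one_percent = expected_checks_per_real_record(0.01)
```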