I have been thinking about the problems of news curation lately, and particularly about how Google News is only possible because it is able to recognize clusters of related stories. I mean, anything that gets in the New York Times gets parroted by 300 other newspapers within 24 hours, and if you didn't fight this the news feed would be spammed by hundreds of copies of the same article. So generating a good news feed requires identifying relevant clusters and then selecting the best articles about other clusters.
I always assumed Google had something specialized for the news problem, built in particular with the temporal structure of the problem in mind, but when I went looking in the literature I didn't see anything on the use of temporal structure in news filtering.
I am working on something now that will use conventional clustering algorithms for text (maybe based on dimensionality reduction of bag-of-words vectors, maybe based on one of these new-fangled neural vectors) to see if there is some easy way to adapt them to the temporal setting. Running a monthly batch job would make it closer to the conventional clustering case than the continuous "process a set of RSS feeds" approach I was thinking about.
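To make the idea concrete, here is a rough sketch of one way a conventional text-similarity clusterer could be adapted to the temporal setting. It is only an illustration under my own assumptions, not how Google News or anything in the literature does it: stories are plain bag-of-words vectors, clustering is a greedy single pass in time order, and a story may only join a cluster that has been active within a time window. The names `cluster_stories`, `sim_threshold`, and `window` are hypothetical, and the default values are arbitrary.

```python
import math
import re
from collections import Counter
from datetime import datetime, timedelta


def bow(text):
    """Bag-of-words term counts for a piece of text (lowercased, alphanumeric tokens)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_stories(stories, sim_threshold=0.5, window=timedelta(hours=48)):
    """Greedy single-pass clustering of (timestamp, text) pairs.

    Assumption: a story joins the most similar cluster whose latest member
    is no older than `window`; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of indices into `stories`.
    """
    clusters = []
    # Process stories in time order so "last_seen" is always the newest member.
    for i in sorted(range(len(stories)), key=lambda k: stories[k][0]):
        ts, text = stories[i]
        vec = bow(text)
        best, best_sim = None, sim_threshold
        for c in clusters:
            if ts - c["last_seen"] > window:
                continue  # cluster has gone stale; treat this as a new event
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "last_seen": ts, "members": [i]})
        else:
            best["centroid"].update(vec)  # summed counts act as an unnormalized centroid
            best["last_seen"] = ts
            best["members"].append(i)
    return [c["members"] for c in clusters]
```

The same function works for both modes mentioned above: feed it a month of stories at once for the batch case, or keep the cluster state around and feed stories as they arrive for the RSS case. Swapping `bow` for a neural embedding would only require changing the vector and similarity functions.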
https://en.wikipedia.org/wiki/Portal:Current_events/December...
If anyone knows how I could get this emailed at the end of the day/week/month, I would be very interested as well.
Unfortunately, these services are not advertised much, but they can usually be found in the product lists of the top information companies.