With the technology, amount of data, and algorithms we currently have, it should be pretty easy to build a tool capable of tracking the articles/papers/talks you read/like and suggesting things you could be interested in.
Does anybody use/know a tool like that? There are many places to get suggestions based on how popular items are across the entire platform, but aren't we in the era of personalization?
I wish there was something similar for the general literature; it only indexes the ML and vision-related communities.
For papers, mostly through social sharing I think: HN, Reddit, and Slack communities. For videos, I subscribe to a few YouTube channels, like PapersWeLove. Reddit has /r/contalks as well for a broader list.
The rest is second-order effects. If in the course of reading an article, I notice the author because I've seen another article from them I like, I'll subscribe to them. Bloggers have RSS feeds, and even researchers like Cormac Herley can be followed via Google Scholar. Or if a conference seems to produce a lot of interesting research, I'll add it to my calendar to review next year's crop of papers. And the stuff I read typically has references, so if a subject interests me, I'll typically glance at the citations as I encounter them and put interesting ones in my backlog.
> a tool capable of tracking articles/papers/talks you read/like and suggest things you could be interested in. Does anybody use/know a tool like that?
YouTube does this. I get recommendations for USENIX talks that are fairly newly uploaded, because I've watched a few in the past. This is actually useful since sometimes USENIX uploads videos years after the conference; I seem to be getting a few recs from conference recordings taken years ago but uploaded this week.
I'd imagine Mendeley and other citation managers could also help here, but it's pretty niche and, as a non-professional researcher, adopting these for professional use is far down my backlog.
Oh, or you can browse the proceedings of the top conferences (in CS, at least). (Though those are also gigantic, and you probably want to filter even further...)
I feel that Google Scholar Alerts are only useful once you can filter paper titles / taglines by yourself, which requires tremendous expertise. I would be very surprised if any automated tool could replace other people and their technical expertise (which took years of training to develop) as a paper filtering tool. Otherwise you might as well automate peer review.
Last.fm used a "you've got X in your library/listened to X, others into X also liked Y" algorithm to suggest and stream a mix of music you might like. It was sufficiently good that in 2012, they had to admit that 43 million of their accounts had been compromised when they were hacked. Their original business model eventually failed, and in 2014 they ended up removing streaming. With that, it effectively became a music version of Goodreads. That change, although the site is still going, killed it for most people.
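To make the "others into X also liked Y" idea concrete, here is a minimal co-occurrence sketch. It is not Last.fm's actual algorithm (which isn't described here); the `libraries` data and the function names are made up for illustration.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(libraries):
    """For each item X, count how often every other item Y shares a library with X."""
    co = {}
    for items in libraries.values():
        for x, y in permutations(set(items), 2):
            co.setdefault(x, Counter())[y] += 1
    return co


def recommend(user, libraries, top_n=3):
    """Suggest items the user lacks, scored by co-occurrence with items they have."""
    co = build_cooccurrence(libraries)
    owned = set(libraries[user])
    scores = Counter()
    for x in owned:
        for y, count in co.get(x, {}).items():
            if y not in owned:
                scores[y] += count
    # Highest score first; ties broken alphabetically so the result is stable.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical listening libraries, purely illustrative.
libraries = {
    "ann": ["X", "Y", "Z"],
    "bob": ["X", "Y"],
    "cat": ["X", "W"],
    "you": ["X"],
}
print(recommend("you", libraries, top_n=2))
```

Two people who have X also have Y, so Y ranks first for a user who only has X. A real system would at least normalise for item popularity, otherwise the most popular items get recommended to everyone.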
Anyway, there's quite a bit of quality writing about Last.fm and its demise, e.g. https://www.washingtonpost.com/news/arts-and-entertainment/w... , https://www.vice.com/en/article/a37x9g/lastfm-was-the-only-m.... For my money, Last.fm must be one of the best case studies into what works well and what does not. I would have a read if I were thinking about something similar.
I think it may strike the balance you are looking for between: 1. following popular content, 2. subscribing to individual sources and 3. getting personalized algorithmic recommendations.
From the Show HN post:
"LinkLonk is a novel mechanism to subscribe to RSS feeds and discover content - upvote or submit a link to anything you liked and you will get connected to RSS feeds that posted this content. The more content you upvote from the same feed - the higher other content from that feed will show up in the For You page. This helps you see content from feeds with the highest signal-to-noise ratio first.
In addition to RSS feeds, you connect to other users who upvoted the same content as you. This way other users help highlight great content from feeds you are already connected to and discover other feeds that they are connected to.
To sum up: upvote content => connect to RSS feeds and users => discover great new content => repeat."
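The feed half of that loop can be sketched roughly as follows. This is only my reading of the quoted description, not LinkLonk's real implementation: the scoring rule (a feed's weight is the number of your upvotes it posted), the feed names and the link ids are all assumptions, and the user-to-user connections are left out.

```python
from collections import Counter


def for_you(upvoted, feeds):
    """Rank not-yet-upvoted links by how many of the user's upvotes their feed has earned."""
    # A feed's weight is the number of upvoted links that it posted.
    weight = {feed: len(upvoted & set(links)) for feed, links in feeds.items()}
    scores = Counter()
    for feed, links in feeds.items():
        if weight[feed] == 0:
            continue  # no upvotes from this feed, so the user is not connected to it
        for link in links:
            if link not in upvoted:
                scores[link] += weight[feed]
    # Highest score first; ties broken by link id for a stable order.
    return sorted(scores, key=lambda link: (-scores[link], link))


# Hypothetical feeds and upvotes.
feeds = {
    "feed_a": ["a1", "a2", "a3"],
    "feed_b": ["b1", "b2"],
    "feed_c": ["c1"],
}
upvoted = {"a1", "a2", "b1"}
print(for_you(upvoted, feeds))
```

Here `a3` outranks `b2` because two upvotes went to `feed_a` and only one to `feed_b`, while `feed_c` never shows up at all.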
1. Links posted on HN
2. Links posted in various sub-reddits (/r/machinelearning, /r/semanticweb, /r/artificial, etc.)
3. Links posted in a variety of Facebook groups I follow
4. Links posted on Twitter by people I follow
5. Google Scholar alerts via email
6. Ones located as references or "external links" on Wikipedia pages
7. Manual searches or browsing of arxiv.org, jmlr, aclweb, aaai website, etc.
8. Google / Google Scholar searches for certain keywords or phrases
9. Ones mentioned in books or cited in other papers
I just follow HN and a couple of other places that provide good content.
This is a complete solution for web and mobile, aggregating articles, news and papers from 1000s of sites across the web.
You will find:
• Latest news, articles, and research in AI, ML, DL, NLP, IoT, Quantum, Web, Mobile, careers...
• Curated news feeds for many AI/computing related topics
• Keyword search is central, with personal lists of topics, sites, and RSS feeds
• Preview or listen to summaries, save and share links
• Newsletters for personal keywords and sites
• Only articles from the last 2 days
https://datastation.multiprocess.io/blog/2021-07-12-this-wee...
The main problem for me often isn't finding interesting stuff, but papers that I can read for free.
For example, when libraries closed their new book shelves for a year due to COVID fears, I started relying on NYTimes Sunday Book Reviews of non-fiction for good choices.
I start with a question. Mine initially was "What are the Big Problems?" I then dive recursively into that.
Starting reading at virtually any point, you'll discover that knowledge is a web, and that the best authorities reference others. Then the spidering begins.
Follow an author's references. If you find a quip or fact or quote or reference which seems especially germane, then look it up. Given today's Internet, this often means you can have a specific reference in front of you in seconds. I had this experience re-reading James Burke's Connections a few years ago, in which he mentioned Agricola's De Re Metallica (which is not about the band), and found that the English translation of this 16th century work (not completed until the early 20th century) was at the Internet Archive. (The translators also have interesting biographies.)
If you find an especially good source, look to see what other sources reference it. That is, look for citations. This is slightly less powerful than the first method, but 1) serves as a check to see what works are truly significant (they'll have high citation counts) and 2) will lead to more current treatments of a concept (which aren't always better or improvements, mind).
Those are the two principal methods.
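As a toy illustration of those two methods, here is a small sketch that walks a citation graph backward (through a work's references) and forward (through the works citing it). The `references` data is hypothetical; in practice you would be following bibliographies and "cited by" links by hand or via a catalogue.

```python
from collections import deque


def invert(references):
    """Turn {work: [works it references]} into {work: [works that cite it]}."""
    cited_by = {}
    for work, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, []).append(work)
    return cited_by


def spider(start, graph, max_depth=2):
    """Breadth-first walk from start; returns {work: depth} for everything reached."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        work = queue.popleft()
        if seen[work] == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in graph.get(work, []):
            if nxt not in seen:
                seen[nxt] = seen[work] + 1
                queue.append(nxt)
    del seen[start]
    return seen


# Hypothetical works and their reference lists.
references = {
    "survey": ["classic", "method"],
    "method": ["classic"],
    "followup": ["method"],
}
cited_by = invert(references)

backward = spider("method", references)  # method 1: follow the references
forward = spider("method", cited_by)     # method 2: find what cites it
citation_counts = {work: len(citers) for work, citers in cited_by.items()}
print(backward, forward, citation_counts)
```

The citation counts fall out of the inverted graph for free, which is the "check to see what works are truly significant" part.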
Once I find an author or topic (subject heading or keywords) of interest, I'll use a traditional catalogue, almost always Worldcat, to look for additional materials. If you find an author of interest, this is a good way of finding their other works. Worldcat indexes both books and articles.
https://worldcat.org/ (DDG bang search: !worldcat). Search prefixes: 'au:' == author, 'ti:' == title, 'kw:' == keyword
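If you want to script this, a small helper can assemble the bang query from those prefixes. This is only a sketch: the `https://duckduckgo.com/?q=` URL form is the ordinary DuckDuckGo query URL, but I am assuming the prefixes pass through the bang to Worldcat unchanged, and the helper name and keyword arguments are my own.

```python
from urllib.parse import urlencode

# Prefixes as listed above; the dictionary keys are just illustrative labels.
PREFIXES = {"author": "au:", "title": "ti:", "keyword": "kw:"}


def worldcat_bang_url(**fields):
    """Build a DuckDuckGo URL for a !worldcat search using au:/ti:/kw: prefixes."""
    terms = [PREFIXES[name] + value for name, value in fields.items()]
    query = "!worldcat " + " ".join(terms)
    return "https://duckduckgo.com/?" + urlencode({"q": query})


url = worldcat_bang_url(author="Agricola", title="De Re Metallica")
print(url)
```

An unknown field name raises a `KeyError`, which seems preferable to silently sending an unprefixed term.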
I don't have a good catalogue for popular magazine or newspaper articles, though there are several commercial options. Some libraries (public, community college) will provide access to these. Google Books captures some of this material, at least for searching.
Google Scholar, Archive.org, Open Library, LibGen, and ZLib are also useful for both searching and sourcing documents.
General Web Search has become all but useless over the past 5 years or so.
Finding an idea, especially one that seems to be universally accepted and unquestioned, and seeking out its source can be profoundly interesting. Google's Ngram Viewer is your principal tool here, as you can see exactly when a specific word or phrase (up to five words) emerges. Quite often "accepted wisdom" is found to emerge with very little empirical foundation. It can be tricky to identify where the breakout occurs and through what work, but this approach seems to work better than others.
Online sources are another option, though what I increasingly find is that more-recent online content tends strongly toward lower value, and less use of these is better. This depends greatly on the field. Among the best options is to not read the current submissions, but to do a specific search for top items within some time bound.
On HN, you can effectively see the top submissions from the past week, month, or year. I've addressed that here:
https://news.ycombinator.com/item?id=28806795
Other sites, notably Reddit, have similar date-bounded search options. Incidentally, if you're assessing whether or not a subreddit is worth subscribing to, reviewing its top posts by week / month / year is useful.
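For HN, one way to script a date-bounded lookup is the public Algolia-backed HN Search API. The sketch below only builds the request URL and makes no network call. It is not necessarily the approach described in the linked comment; the helper name and its defaults are my own.

```python
import time
from urllib.parse import urlencode

# Search endpoint of the HN Search API (hn.algolia.com).
API = "https://hn.algolia.com/api/v1/search"


def top_stories_url(days, now=None, per_page=20):
    """Build a URL for stories created within the last `days` days."""
    now = int(time.time()) if now is None else int(now)
    since = now - days * 86400  # seconds per day
    params = {
        "tags": "story",
        "numericFilters": f"created_at_i>{since}",
        "hitsPerPage": per_page,
    }
    return API + "?" + urlencode(params)


# A fixed `now` keeps the example reproducible.
url = top_stories_url(7, now=1_700_000_000)
print(url)
```

Swapping `days` between 7, 30 and 365 gives the week / month / year views; check the API documentation for how results are ranked before relying on the order.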
In general, I find that identifying a good author or publication, and "stalking" their output, is superior to virtually any user-generated content site (FB, Reddit, Twitter, HN, etc.).
Books and articles have higher hurdles to publication than online articles do. The Internet's editorlessness is becoming more of an obstacle than a benefit as there is simply so much crap online.
Track your references. Zotero seems to be the gold standard here, though I don't use it myself. Calibre has its uses. Avoid Mendeley like the plague it is.
Consider a Zettelkasten or equivalent. I'm referring to pen-on-paper index cards as the most robust option here, though there are digital versions. Of these, I'd strongly recommend Emacs org-mode or a flatfile ASCII / UTF-8 reference as the most robust, possibly a wiki. The simpler and more robust this is, the better, as it will quite possibly last your entire life. The problem with hot new software is it often does not.
I strongly recommend an e-book reader or tablet with the absolute most onboard storage you can manage. I'm pretty happy with the Onyx BOOX line, and have their largest device, the 13.3" Max Lumi. It's been updated recently to 128 GB onboard storage (mine is 64 GB, and I'm bumping up into that), and I'd prefer that were bumped to 1 TB (some Apple iPads reach this).
I strongly prefer e-ink to emissive displays.
For size, 6" is about the minimum you should consider, 8" is comfortable for most straight text or e-pubs, and 10-13" is much better for scanned-in PDFs of older works and articles in particular. (That was my thinking in buying the Max Lumi, and it's largely been validated.)
My usual problem isn't too little to read, but far too much, and setting (and sticking to) priorities on that.
Otherwise I do searches using Publish or Perish and collect the papers into Zotero using Sci-Hub.