Likewise, "peer review" typically just means that 3 academics read the article and thought it was interesting and worth publishing. It's important to recognize that a reviewer accepting an article for publication does not imply that the reviewer endorses any conclusions or arguments made by the article.
It's usually not a great idea for laymen to read academic journal articles, and it's an especially bad idea for the news media to report on them. It's not that academia is trying to "hide" things from the public; rather, academic journals simply aren't written for a public audience, otherwise they'd be written differently.
Sadly, there's also the "publish or perish" factor: as an academic, you have to publish something to justify your existence in academia, get tenure, etc. So there's a lot of material that in an ideal world might be considered filler.
And peer review does not get to the question of whether things are "legit". Likely you are quite aware of problem cases that come up in the news every once in a while. Bear in mind that those problem cases were peer reviewed, often through many, many stages, and nobody twigged to the problem.
Perhaps the answer to the "what" question is: get an advanced degree, work for a decade or so in a particular sub-field, attend conferences to know what is being done at the cutting edge, spend a lot of time reading, and use all the tools at your disposal to evaluate the work.
In a nutshell, I'd say that non-experts should not expect to be able to evaluate the legitimacy of scientific papers and experiments. And even a highly-skilled expert in one sub-field is a non-expert in another.
There is no simple N-step program to yield what seems to be sought in your question.
1. Read the abstract super carefully. Typically, nothing in a paper that's not in the abstract should be considered absolutely confirmed by that paper.
2. If you're new to the field, you can read the introduction. Ideally you shouldn't need to.
3. Skip directly to the FIGURES. You should not read the results section! That is a textual interpretation of the results from the authors' perspective. If the paper is half decent you shouldn't need it: everything the paper is trying to say should be fully substantiated by, and synthesizable from, the figures and their legends alone. Especially in biology.
4. Most definitely skip the discussion; it's literally what the authors want the paper to be, and it should not affect your judgement.
The most interesting part is that this is hard for a beginner or someone outside the field. But I think it's a good yardstick: if you can't infer the results from just looking at the figures, the combination of you and the paper is not sufficient for you to fully evaluate that scientific finding with any degree of certainty.
There are always obvious things to catch: do all of the data points plotted on this graph make sense? Does a predictive model fall back to chance performance when it should? Should the authors need to z-score in order to see this effect? Other labs typically use X statistical model here: why are these people using something different, and how should I expect the results to change? If my brain is having a difficult time understanding how they went from raw data to a given plot, that's usually an indication that something's fishy.
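To make the "falls back to chance" check concrete, here's a minimal sketch (hypothetical data and a stock scikit-learn model, not any particular paper's pipeline): fit the same model on real labels and on shuffled labels; the shuffled run should hover around chance, and if it doesn't, something is leaking.

```python
# Shuffled-label sanity check on made-up data (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                   # 200 samples, 10 features
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0)   # labels weakly tied to feature 0

model = LogisticRegression(max_iter=1000)
real = cross_val_score(model, X, y, cv=5).mean()

y_shuffled = rng.permutation(y)                  # break the feature-label link
chance = cross_val_score(model, X, y_shuffled, cv=5).mean()

print(f"real labels:     {real:.2f}")            # should beat chance
print(f"shuffled labels: {chance:.2f}")          # should sit near 0.5 for two balanced classes
```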
Then you need to check for consistency between the results. Figures should come together to tell a coherent story. If there are small discrepancies in what Figure 1 and Figure 3 are telling you, investigate. It could be that the authors are hiding something.
Then there's sociological factors: how much do I trust the work that this group puts out? Being in higher-prestige journals is actually an indication that you should be more skeptical of the results, not less. There's more riding on the results telling a neat and coherent story, and more space to hide things in (given that you typically need to do 2 or 3 papers worth of experiments to publish in Nature these days).
Unfortunately, there's a great deal of bullshit published every day. Every paper has a little bit of it. The trick is to get great at spotting it and getting something out of the paper without getting discouraged with science as a whole.
1) Read the abstract, to see if it's something you're interested in and willing to spend some time on. Reading the introduction will often help here as well.
2) Read the materials and methods section and attempt to reconstruct in your head or on a piece of paper exactly what the structure of the research was. If you're not familiar at least a little with the field this may be a rather difficult exercise, particularly if they use techniques you've never heard of. However, without going through this, you can't really do any critical evaluation of the research. This can be the most difficult part; understanding the techniques (and their various pitfalls etc.) may require reading other papers cited in the bibliography.
3) Read the results, and in particular, focus on the figures, which should correlate with the materials and methods section fairly well. Here look for things like error bars, statistics, clarity, etc. Poorly written papers tend to obfuscate more than enlighten. There should be good correspondence between materials and methods and results.
4) Finally, read the discussion. Overblown claims that don't seem supported by the results are not that uncommon, so retain a healthy sense of skepticism. A paper that discusses uncertainties in the results in an upfront manner is typically more reliable.
Thoroughly evaluating a paper in this manner is quite a bit of work, and can take hours of effort, even for someone with actual research experience in the field.
Usually, if it's a field you're not that familiar with (say, as an undergraduate or a new graduate student), you'd start not with research reports but with published reviews of relatively recent work, which will give you an idea of what's mostly settled and where the controversies are. If you were starting completely from scratch, a well-regarded textbook could be a better introduction.
Besides some of the suggestions already presented, here are some additional elements I look for:
1) Do the references actually support the claims they're cited for? Surprisingly, you can find a lot of reference fodder that is misused.
2) Does the validation make sense, or is it gamed? Sometimes the article's method uses one model but slightly tweaks it during validation, for example by dropping troublesome data or using a different classification scheme. Does the model actually compare to, and improve on, current state-of-the-art practice?
3) Do the authors offer their data up for review?
Almost all papers will overstate their own importance in the introduction, because you have to do that to get published. It takes context of the field to tell the difference between, e.g., "oh, this guy is just running random simulations, picking out odd results, and publishing them" and "this person has done a comprehensive set of simulations with realistic parameter sweeps and has published the whole results set".
When reporters write stories, they often just take press releases that research groups put out themselves. Some groups sell themselves better than others. In my old field, one group in France looks "dominant" if you go by the news, but in reality they're one of many that do good work; the others just have less press around them because they're not working at such press-hungry organisations.
1. What are the actual claims made by the authors? That is, what do they claim to have found in the study? If you cannot find these, there is a good chance the paper is not particularly useful.
2. For each claim, what were the experiments that led to that conclusion? Do those logically make sense?
3. Are the data available, either raw or processed? Try reproducing one of their more critical figures (see the sketch after this list). Pay close attention to any jumps in series (e.g. time suddenly goes backwards -> did they splice multiple runs?), dropped data points, or fitting ranges. If the data are not available, consider asking for them. If the code exists, read through it. Does the code do anything weird that was not mentioned in the paper?
4. How do they validate their methods? Do they perform the appropriate control experiments, randomization, blinding, etc? If the methods are so common that validation is understood to be performed (e.g. blanking a UV-Vis spectrum), look at their data to find artifacts that would arise due to improper validation (e.g. a UV-Vis spectrum that is negative).
5. Do they have a clear separation of train and test / exploration and validation phases of their project? If there is no clear attempt to validate the hypothesis for new samples, there is a good chance the idea does not transfer.
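For point 3, here's a rough sketch of the kind of splice check I mean, in numpy (the file name and column layout are hypothetical, standing in for whatever data the paper released):

```python
# Look for time going backwards or suspicious gaps in a released data series.
import numpy as np

# hypothetical supplementary file; column 0 assumed to be a time stamp
t = np.loadtxt("supplementary_run.csv", delimiter=",", usecols=0)

dt = np.diff(t)
backwards = np.where(dt < 0)[0]               # time decreasing -> spliced runs?
gaps = np.where(dt > 10 * np.median(dt))[0]   # unusually large steps -> dropped points?

print(f"{backwards.size} backwards steps at indices {backwards}")
print(f"{gaps.size} suspicious gaps at indices {gaps}")
```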
If you mean "legit" as in a high quality work that you can trust the conclusions, the things that I see people get wrong most of the time are:
- Make sure that what you understand the conclusion to be is exactly what the paper concludes. That is the one I see people get wrong most often; if you are not an academic, papers usually don't say what you think they say. One of the things to do here is to take your list of fallacies and check whether you have fallen for any of them.
- Make sure the paper's conclusion is supported by its body. Yep, once in a while the above problem affects the scientists themselves, not only outsiders. And peer reviewers are not immune to it either.
- Take into account the paper's (predictive or explanatory) power. For more complex experiments, this is summarized by the p-value. Keep in mind that there may be many papers out there that were never published because they got the boring result, so the odds that you are looking at a statistical oddity are higher than they seem (see the simulation sketch after this list). It's usually wise to discount a single paper with only borderline statistical strength (a 95% confidence level, or for a popular research topic maybe even 99.5%) and wait for a trend to appear across papers. Also try to answer how the hypothesis would apply to different populations, and whether there is any reason the paper got a biased result.
- If you can, look at how any experiments were conducted, and if (and how) the author corrected for confounding elements. But this one already requires a lot of know-how.
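To make the "statistical oddity" point concrete, here's a toy simulation (my own illustration, with made-up numbers): run lots of experiments where there is no real effect, and about 5% still clear p < 0.05. If mostly those get written up, a lone borderline result tells you very little.

```python
# Toy simulation: how often pure-noise experiments reach p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_experiments, n_per_group = 10_000, 30

false_positives = 0
for _ in range(n_experiments):
    a = rng.normal(size=n_per_group)   # control group, no true effect
    b = rng.normal(size=n_per_group)   # "treatment" group, same distribution
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(f"{false_positives / n_experiments:.1%} of null experiments reached p < 0.05")
```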
Evaluating a paper involves having spent several years reading many papers and understanding where the field is and where the major challenges are.
Trying to read a single paper in isolation from all the others would lead to over-interpretation of the results.
This is why many graduate courses often just involve reading the latest literature and discussing it as a group. Discussing a paper as a group, especially with a few senior people present, helps put the paper in a larger context (because together the group has read more papers than any individual). This greatly accelerates a PhD student's understanding of the current state of the art, which is needed to define the most interesting next questions.
Overall though, most scientific papers are meant for scientists and the point of them is to figure out whether there is some new idea that can lead to new directions or inquiry. For non practicing scientists, review papers are the way to go because they summarise many papers and try to put them into the context of the open questions.
Here are some good tips: "How to Read a Paper", S. Keshav (2016), David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada.
Usually on skimming and first read I'll flush out all the concepts and words I don't understand. I can then figure those out when reading about the paper. Reading what other people have written about it will help you understand more as well. Then, on the final reread I'm prepared to understand.
At this point, I'm either satisfied with some takeaway (usually much narrower than any press about the article) or I have doubts or questions. Search to see what others say about the doubts or questions. Then, read references around central or implausible claims. It's not really uncommon to find references that are dubious - e.g. the author cites a figure from another paper, you go to that paper and it gets the figure from somewhere that gets it from somewhere that gets it from someone's example of a different thing.
The idea of "legit" does not exist in science. There are hypotheses, created by observation of some phenomenon and an attempt at generalization. There are tests of the hypothesis, which are experiments aimed at disproving it. There's also accumulation of evidence against a hypothesis. Those that don't fall over due to contradictory evidence endure.
Just like nobody rings a bell at the top of a market, nobody blows a horn to announce the arrival of scientific truth. It's all subject to being overturned at any time by new evidence.
When I encounter a new paper, this background process is running. Where along that trajectory does this paper lie? Is it contradicting previous findings? If so, how strong is the evidence? How reasonable are the experimental methods? How consistent with the evidence are the conclusions? Was a falsifiable claim ever formulated and/or tested in the first place?
1) is it a journal that I've heard of? Does it have a reputation?
2) who are the authors? what else have they published in the same vein?
3) who funded the research?
4) what sort of content does it survey? who does it cite?
5) how thorough is the study? Is it a low-power study that barely gets a p < 0.05 result? (See the sketch below.)
there's a lot to look at beyond the content. The content itself is (of course) the most important part, but you might not be able to assess it without a doctorate. That is what peer review is for.
None of these on its own is necessarily enough to write a paper off. But I'm going to trust something that appears in The Lancet more than something in a journal I've never heard of before.
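On the low-power point in 5), a toy simulation (illustrative numbers only, not any real study) shows why such studies deserve extra skepticism: with a small true effect and small groups, only a minority of runs reach p < 0.05, and the runs that do tend to overestimate the effect.

```python
# Toy low-power study: small true effect, 20 subjects per group.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
true_effect, n, n_runs = 0.2, 20, 5_000
sig_effects = []

for _ in range(n_runs):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(true_effect, 1.0, n)
    _, p = stats.ttest_ind(b, a)
    if p < 0.05:
        sig_effects.append(b.mean() - a.mean())

print(f"power ~ {len(sig_effects) / n_runs:.0%}")     # how often p < 0.05 at all
print(f"mean effect among significant runs: {np.mean(sig_effects):.2f} "
      f"(true effect is {true_effect})")              # typically inflated
```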
I almost never give weight to a single empirical paper. I look at them as "hypotheses with data".
If the paper has multiple lines of converging evidence, across multiple samples and methods, it's better.
Replication is better still, and still better with preregistration of some sort.
Really the gold standard to me is a meta-analysis of multiple studies that tries to account for publication bias. RCTs, large samples, preregistration, multiple labs, and involvement of skeptics all make it better.
There are degrees of legitimacy, and no silver bullet.
You lose a lot of nuance that way, but if you're researching a general topic that's not super niche, it can be a helpful overview.
That's also not for groundbreaking research since it usually takes years, but very little in science is groundbreaking like that anyway, and I doubt a layperson would be able to recognize the true breakthroughs on their own.
When I don't have those options I read carefully, particularly any evaluation methods. If it's machine learning I also know to look for mistakes in hyperparameter tuning.
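To illustrate the tuning mistake I mean (with made-up data and a stock scikit-learn model, not any specific paper): if hyperparameters are selected on the same data used to report the score, the reported number is inflated; the held-out test set should only be touched once, at the end.

```python
# Tune inside cross-validation on the training split; report on a held-out test set.
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 20))
y = rng.integers(0, 2, size=300)          # pure noise labels: honest accuracy ~ 0.5

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}, cv=5)
search.fit(X_tr, y_tr)

print(f"best CV score during tuning: {search.best_score_:.2f}")       # optimistic
print(f"held-out test score:         {search.score(X_te, y_te):.2f}") # ~0.5 on noise
```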
That said, an alternate approach is to just read the best papers from well respected conferences and venues in your area, rather than cherry picking particular papers.
Also, I do recall a book Bad Science that I've had on my read list for a while now ;) It seems to cover the topic but since I didn't read it I can't vouch for it.
Here is a demo https://smort.io/demo/home
Just prepend smort.io/ before any arXiv URL to read. No login needed!
Look especially for meta-analyses, where other scientists have gone through and critically evaluated a result, aggregated similar experiments, and checked for publication bias.
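For a sense of what the aggregation step actually does, here's a minimal fixed-effect pooling sketch (hypothetical effect sizes and standard errors, not real studies): weight each study by the inverse of its variance and combine.

```python
# Fixed-effect (inverse-variance) meta-analysis on made-up study results.
import numpy as np

effects = np.array([0.30, 0.10, 0.25, -0.05, 0.20])   # per-study effect estimates
ses     = np.array([0.15, 0.20, 0.10, 0.25, 0.12])    # per-study standard errors

w = 1.0 / ses**2                                       # inverse-variance weights
pooled = np.sum(w * effects) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))

print(f"pooled effect: {pooled:.2f} +/- {1.96 * pooled_se:.2f} (95% CI half-width)")
```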
"Warning Signs in Experimental Design and Interpretation" http://norvig.com/experiment-design.html
“Why Most Published Research Findings Are False”
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/pdf/pme...
- https://en.m.wikipedia.org/wiki/Why_Most_Published_Research_...
- https://scholar.google.com/scholar?hl=en&as_sdt=0%2C21&q=Why...