HACKER Q&A
📣 KingOfCoders

Best empirical papers on software development?


There are some good empirical papers, but I only know a few. What is your best empirical paper on software development?


  👤 hwayne Accepted Answer ✓
These days I've been really keen on qualitative studies, where scientists work on building a better understanding of software instead of trying to validate theories. Some examples of that:

https://link.springer.com/article/10.1007/s10664-016-9464-2: scientists followed a team of developers for three years and recorded all of their sprint retrospectives. They found that people kept forgetting what they had already learned.

https://jlubin.net/assets/oopsla21.pdf — researchers compiled 23 hours of Zoom sessions and 15 hours of Twitch programming livestreams to study how people write code in static FP languages.

In addition to that, there's a lot of good work on how we teach programming. "Commonsense Computing" [1] found that students understand concurrency a lot faster when it is presented as a "human" problem, such as selling tickets for a concert. I'd recommend reading Teaching Tech Together (http://teachtogether.tech), which references a lot of empirical papers on teaching programming.

[1]: https://cseweb.ucsd.edu/classes/fa08/cse599/Papers/ICERConcu...


👤 Qualadore
In terms of methodological quality, "Fixing Faults in C and Java Source Code: Abbreviated vs. Full-Word Identifier Names" [0] is a favorite of Hillel Wayne's [1].

> Manageable scope.

> They did their homework.

> They mix qualitative and quantitative methods.

> Objective measure of Defects.

> Really, _really_ good experimental setup.

> And then an ethnography.

[0] http://www2.unibas.it/gscanniello/Giuseppe_Scanniello%40unib...

[1] https://www.hillelwayne.com/post/the-best-se-paper/


👤 bramblerose
I quite like https://neverworkintheory.org/reviews/ which has summaries of recent empirical research papers

👤 ta238911
A great recap of some papers: Software engineering's greatest hits.

* https://www.youtube.com/watch?v=HrVtA-ue-x0

* slides: https://third-bit.com/talks/greatest-hits/#1


👤 pid-1
Here are a few I like:

Why Do Computers Stop and What Can Be Done About It? https://pages.cs.wisc.edu/~remzi/Classes/739/Fall2018/Papers...

Not a paper, but the whole book "Accelerate" presents many empirical findings related to automating software operations.

Hidden Technical Debt in Machine Learning Systems - NIPS https://papers.nips.cc/paper/2015/hash/86df7dcfd896fcaf2674f...


👤 okasaki
You might like to review the book Evidence-based Software Engineering, which is freely available online: http://www.knosof.co.uk/ESEUR/

👤 eimrine
The Mythical Man-Month: a short book written decades ago, but in my opinion it is timeless.

👤 smaddox
Technically not a paper, but Casey Muratori takes a very empirical approach to software development, which I find extremely practical:

- https://caseymuratori.com/blog_0015

- https://caseymuratori.com/contents


👤 geekjock
I write and share summaries of papers related to developer productivity here: https://abinoda.substack.com

Two of my favorites so far are https://homepages.dcc.ufmg.br/~figueiredo/disciplinas/papers... and https://faculty.washington.edu/ajko/papers/Li2019WhatDisting...


👤 eclarkso
I have always appreciated https://www.cs.cmu.edu/~NatProg/papers/Ko2008JavaWhyline.pdf - an empirical analysis of debugging, a tool based on those findings, and an evaluation of that tool.

👤 arijr
Hard question, there are a lot. One favourite is a paper I came across years ago in a university course, when I was finishing my software engineering MSc and had already worked in global software development for years. The research was so well put together on a difficult-to-measure topic:

An empirical study of speed and communication in globally distributed software development

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.324...


👤 Pelam
The studies referred to in this book:

https://www.amazon.com/Accelerate-Software-Performing-Techno...

Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations, by Nicole Forsgren, Jez Humble, and Gene Kim


👤 divbzero
Are there studies comparing total development effort to build and maintain software using types vs. no types? You would need to use closely related languages like TypeScript vs. JavaScript, or something like Python with vs. without type hints.

👤 westurner
From https://en.wikipedia.org/wiki/Experimental_software_engineer... :

> Experimental software engineering involves running experiments on the processes and procedures involved in the creation of software systems, with the intent that the data be used as the basis of theories about the processes involved in software engineering (theory backed by data is a fundamental tenet of the scientific method). A number of research groups primarily use empirical and experimental techniques.

> The term empirical software engineering emphasizes the use of empirical studies of all kinds to accumulate knowledge. Methods used include experiments, case studies, surveys, and using whatever data is available.

(CS) Papers We Love > https://github.com/papers-we-love/papers-we-love#other-good-... :

- "Systematic Review in Software Engineering" (2005)

-- "The Developed Template for Systematic Reviews in Software Engineering"

- "Happiness and the productivity of software engineers" (2019)

DevTech Research Group (Kibo, Scratch Jr) > Publications https://sites.bc.edu/devtech/publications/

DevTech Research Group > Empirical Research, instruments: https://sites.bc.edu/devtech/about-devtech/empirical-researc...

"SafeScrum: Agile Development of Safety-Critical Software" (2018) > A Summary of Research https://scholar.google.com/scholar?cites=9208467786713301421... (Gscholar features: cited by, Related Articles) https://link.springer.com/chapter/10.1007/978-3-319-99334-8_...

Re: Safety-Critical systems, awesome-safety-critical, and Formal Verification as the ultimate empirical study: https://news.ycombinator.com/item?id=28709239


👤 mtmail
Can you list the ones you know/found?