HACKER Q&A
📣 jasonjmcghee

What do you use to communicate data analysis?


I’m really curious what people tend to use to communicate the findings of research, analytics, ML experiments, infra costs, errors, usage, etc.

Do most people use Google Slides? Or write a doc in Google or Notion? Or send around a notebook?


  👤 rapjr9 Accepted Answer ✓
I preferred to use a web site. If you send data or analysis by email or in documents, people lose them, can't find them, and ask you to send them again. If it is on a web site they can always find it, and any new data or analysis shows up in the same place. I used password-protected pages when we didn't want the data to reach the public. A web site is also great when collaborating with another team, or with multiple teams across the world.

I've also used wiki pages, which make it somewhat easier to get the data into a presentable format, although they are less useful for presentations because the formatting options are limited. A content/version management system (CVS, Git) is useful for archiving data and analysis and making sure it is timestamped and unaltered while still widely available, though only to people who are comfortable using such systems. Spreadsheets are my least favorite way of sharing data and analysis. They are so limited (try creating a billion-row spreadsheet), but they were sometimes necessary for generating graphics for papers.

Really big datasets or very long time series graphs are a problem unto themselves. They can typically have only one primary residence (with backups), on a dedicated server with custom software to generate and view the analysis. Sending copies usually means sending hard drives. I prefer raw data files to putting data into databases because you can write very fast code to process them. When you inevitably have to reprocess the data, it goes quickly, which can be a lifesaver when you're facing a deadline. Automatically regenerating the graphs and so on is also really useful. Real-time, live analysis generated from continuously streaming data generally requires something custom, and what you build depends on who needs to see the data and graphs.
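To make the raw-files point concrete, here is a minimal sketch of the kind of reprocessing script I have in mind. It is only an illustration, not what I actually ran: the CSV layout, the `value` column, and the function names `summarize` and `regenerate_report` are all made up for the example. It streams each raw file once and rewrites a small report, so rerunning the whole analysis is a single call.

```python
import csv
from pathlib import Path


def summarize(path, column="value"):
    """Stream one raw CSV file and return count, min, max, and mean of a column.

    The column name is a hypothetical default; real files will differ.
    """
    count = 0
    total = 0.0
    lo = float("inf")
    hi = float("-inf")
    with open(path, newline="") as f:
        # DictReader yields one row at a time, so memory use stays flat
        # no matter how large the file is.
        for row in csv.DictReader(f):
            x = float(row[column])
            count += 1
            total += x
            lo = min(lo, x)
            hi = max(hi, x)
    if count == 0:
        raise ValueError(f"no rows in {path}")
    return {"count": count, "min": lo, "max": hi, "mean": total / count}


def regenerate_report(raw_dir, out_path, column="value"):
    """Re-run the summary over every *.csv in raw_dir and rewrite the report.

    Returns the number of raw files processed.
    """
    lines = []
    for path in sorted(Path(raw_dir).glob("*.csv")):
        s = summarize(path, column)
        lines.append(
            f"{path.name}\t{s['count']}\t{s['min']}\t{s['max']}\t{s['mean']}"
        )
    Path(out_path).write_text("\n".join(lines) + "\n")
    return len(lines)
```

Something like `regenerate_report("raw", "report.tsv")` could then be run from a cron job or a Makefile, and the same pattern extends to redrawing graphs from the summary.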

👤 uptownfunk
Depends on the audience. If speed matters, then just screen grabs of whatever I'm working in (Jupyter or RStudio) pasted into an Excel workbook.

If I have more time and it's more formal, then charts in a PPT

If more technical, then notebook (markdown)


👤 kristenkehrer
Depends on the audience. Anyone non-technical will see a lovely PPT. Technical teammates will see my CometML dashboard, and may also see the PPT if there is quantified business impact that needs to be shared. I'd only share a notebook if I were helping someone do something similar; I don't normally do anything fancy with my notebooks.

👤 alexmolas
Depending on the audience, I use slides or notebooks. If it's a technical audience and people want to hear the details, I use Jupyter, because it lets me mix code, markdown, and visualisations. If the audience is only interested in the main results, I use a few slides covering the principal findings.

👤 jstx1
Is the file format really important?