HACKER Q&A
📣 tonybaboni

What's your ideal PhD workflow?


I'll be starting a CS PhD (specifically comp bio/biomedical data science) at Columbia this September and have been thinking a lot about what sort of workflow/tech stack would be best to use during this period. I've only ever worked in academic research, where I haven't always had much structure; what structure there was usually resembled a waterfall-type workflow.

So far I'm thinking kanban to organize to-dos, SciWheel for citation management, Evernote for general note-taking, Overleaf for LaTeX, and Doom Emacs as an IDE, all used in an Agile-esque way.

Are there any suggested organizational methods/tools that you might recommend?


  👤 bernulli Accepted Answer ✓
Excellent advice here on tools.

My biggest piece of advice, that resulted in my biggest push of productivity, was a daily routine tailored to how your performance varies over the day.

For me

(1) sitting down and writing (anything: paper, thesis, proposal, blog) first thing in the morning, before emails. You want the habit of writing, and feeling weird if you don’t. This is my peak performance for the day, the hardest thing for me.

(2) celebrate your writing with a nice coffee and find a nice place on campus to study a paper. My 2nd most demanding task.

(3) Lunch break with colleagues. Socialize, discuss ideas, but also just have fun.

(4) My low point after lunch, I’ll try to have meetings here or do some routine stuff (forms, reimbursement, news, internet).

(5) Exploratory research, programming, planning, ideation, discussions, brainstorming.

(6) Exercise, sport.

(7) My programming would come back strongest towards the evening.

(8) If you have family, go see them and have dinner with them. If not, I’d get another coffee and see how far I can go.

EDIT: (9) Write the todo list for the next day and open the document you want to work on the next morning. You don’t want to waste any brain cycles in the morning deciding what to work on.

Of course this is different for everyone, but matching your productivity to the task will make it a habit and much more effective.


👤 lqr
My main workflow improvement over the course of my PhD has been making my papers reproducible. This means: I clone a GitHub repo, then I execute a single command to re-run all computations, re-generate all figures and tables, and compile the final paper.

Any manual step like "resize the figure window until the proportions look good", or "run script A, then run script B", or "upload this file to Overleaf" creates an opportunity for your results (figures and tables) to become out of sync with your source code. This can cause big problems.

I use Makefiles, Anaconda, and GitHub CI to accomplish this [1], but the techniques don't matter. The point is the end goal.

More broadly, in research it's always tempting to do things quick and dirty. Conference deadlines loom. Code bases are usually small and short-lived. But in my experience, your 4-months-later future self will appreciate tests, documentation, etc. just as much as your 4-years-later future self.

You can still be quick and dirty when it comes to project scope. This is where you get most of the speedup anyway. Go ahead and write a function that doesn't handle edge cases, but add a comment about it!

[1] https://github.com/jpreiss/reproducible_papers
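For anyone who wants to see the "single command" idea in code, here is a minimal sketch in Python. It is not the Makefile setup described above, only a simplified stand-in for the same rebuild-when-stale logic. The file names (`experiment.py`, `plot.py`, `paper.tex` and their outputs) and the helpers `is_stale` and `build` are hypothetical, made up for illustration.

```python
import os
import subprocess
import sys


def is_stale(target, sources):
    """Return True if target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


def build(rules):
    """Run each rule's command, in order, whenever its target is stale.

    `rules` is a list of (target, sources, command) tuples, ordered so that
    every target comes after the targets it depends on.
    Returns the list of targets that were rebuilt.
    """
    rebuilt = []
    for target, sources, command in rules:
        if is_stale(target, sources):
            # check=True stops the whole build on the first failing step.
            subprocess.run(command, check=True)
            rebuilt.append(target)
    return rebuilt


# Hypothetical project layout (assumption, not from the linked repo):
# one script produces the data, one the figure, and pdflatex the paper.
RULES = [
    ("results.csv", ["experiment.py"], [sys.executable, "experiment.py"]),
    ("figure.pdf", ["plot.py", "results.csv"], [sys.executable, "plot.py"]),
    ("paper.pdf", ["paper.tex", "figure.pdf"], ["pdflatex", "paper.tex"]),
]
```

Calling `build(RULES)` from one entry-point script would then be the only step anyone has to remember. A real Makefile handles dependency ordering and many edge cases that this sketch ignores.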


👤 mkl
No matter how good the advice you get, you are likely to find you need to change strategy, possibly multiple times. It's hard to learn how to do research and development by being told things, which is why we have the apprentice-like PhD system.

Don't be afraid to change tack completely or give up on things. The most important thing I learned during my PhD (though too late) was how important disproving and discarding ideas quickly is. Repeatedly doing the simplest, quickest, easiest thing to disprove your current ideas can take you a long way. Persisting because you believe (without doubting and checking) is a recipe for wasted time and effort. This applies both to research and to systems for notes, citations, development, etc., and you shouldn't underestimate plain text files or even handwriting (even though paper can't be searched automatically, you can flick through and compare pages quickly, and can scan for backups and type up anything important). I wasted too much time trying to figure out and use technological ways of doing things. It's also often hard to drag collaborators into good technology practices.

I haven't continued in research, so take this with that grain of salt, but I don't think I'm saying anything inaccurate.

PS "you're" means "you are"; you want "your".


👤 mhh__
1. Learn to use Git, and find a way to use it for as much of your work as possible. Keep your notes in a git repository.

2. Institute some reasonable rules about managing data and figures, i.e. don't accidentally lose anything - I tend to get distracted easily, so I usually don't let my scripts even try to overwrite previous output.

3. Latency is awful; think about not using Overleaf for LaTeX (this depends on what kind of workflow you need) as it's quite slow. And if you do use Overleaf, try to integrate it into some Git workflow. It doesn't have to be complicated, but PhDs are not done in a week, so don't even allow any risk of getting lost in your own work a year or two down the line.

4. If your work is in any way at risk (e.g. is your thesis - literally and figuratively - reliant on some non-fungible data?), keep backups.
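The "never overwrite previous output" rule from point 2 can be sketched in a few lines of Python. This is only one way to do it; the helper names `save_new` and `next_free_path` are hypothetical, invented for this example.

```python
import os


def save_new(path, text):
    """Write text to path, raising FileExistsError if path already exists."""
    # Mode "x" means exclusive creation: open() fails instead of truncating.
    with open(path, "x", encoding="utf-8") as handle:
        handle.write(text)


def next_free_path(path):
    """Return path itself, or path with a numeric suffix, that is unused."""
    if not os.path.exists(path):
        return path
    stem, ext = os.path.splitext(path)
    counter = 1
    while os.path.exists(f"{stem}_{counter}{ext}"):
        counter += 1
    return f"{stem}_{counter}{ext}"
```

A script would then call something like `save_new(next_free_path("results.csv"), data)`, so an earlier run's output is kept rather than silently replaced.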

I also think it's probably very easy to overthink this. If we were talking about PhD "workflows" I would focus on working effectively with the people around you (and put your career in the right hands), and finding ways to incentivize yourself to do what you need to do on time (Reading your post you sound organized, so that might not be a problem).

https://en.wikipedia.org/wiki/Social_technology (In industry, engineering is very often the easy part compared to actually building healthy and productive companies)


👤 kaitai
Read Robert Boice (Advice for New Faculty) and take his research to heart.

I can't recommend workflows; they all depend on your personality and habits. To get my dissertation done I wrote a $25 check (don't laugh, on $18k/year that was real money) to the main political party not of my choice and promised everyone I'd send it on the day I didn't do a page of honest writing on the dissertation draft. Worked better than any of the tools above.

Whatever tools you use: start before you are ready, stop before you feel done, wait patiently, and reach out to get feedback early.


👤 kowlo
My opinion:

Use what you know - it doesn't matter. PhD students often end up procrastinating by searching for the right productivity tools or trying to over-organise their "workflow".

Focus on the research question, use whatever software you already use for writing, and use whatever software you already use for back-ups. Back-ups are important - but you don't need to use some fancy solution, just something that works.

If you have to use LaTeX (if you are comfortable with it, then I advise that you do) then use Overleaf - only because it lets you get up and running instantly and work from anywhere with a browser.

Get used to writing and not aiming for perfection. Writing drafts can be tough if you feel that each paragraph you write has to be perfect or "good enough" before moving on to the next. You could end up spending a whole day writing a single paragraph or two.


👤 DataJunkie
I am going to assume you will be taking classes for a while. Most of my advice is about the dissertation.

This one is common sense, but I violated it. Pick an advisor that works in the area you are interested in. I ended up picking an advisor that I got along well with, and that had the same open-source philosophy that I had, but was not in my field of interest. While he was brilliant, it was bad. He ended up retiring and I had to pick a new advisor, this time in my field. I wish I had worked with the new advisor the whole time because he was much more constructive regarding my field of research and probably would have helped me open more doors.

The biggest productivity win for me was Dropbox [about 5 years ago] (with the extended history option). I made several attempts at using Git, SVN etc. and Dropbox just made everything so much easier. The actual data I simulated took up TBs of space, so of course that wasn't in Dropbox, just my manuscript, research sources etc. I accidentally deleted chapters and the backup feature made them easy to retrieve. My work was synced on all of my devices.

After leaving a job, I moved to the mountains and dedicated 2 years to finishing the dissertation. Someone wrote a book about finishing your dissertation 15 minutes at a time. That's sort of what I practiced. During the day, I would keep active by mountain biking and hiking, and then in the evening I would spend 2-3 hours writing code or writing the manuscript.

It worked out nicely.


👤 ISL
No matter what, you'll be part of a team. That team might be just you and your advisor, or it may involve more people.

Work to understand that team and find ways to help it achieve its goals (among them will be success in your research).

It isn't so much about specific tooling as it is iterating to help the team find the right path forward.

Tools-wise: keep it simple, try to align with your group's existing tools, be flexible so you can iterate and adapt.

If you're already in contact with mentors/advisors at Columbia, ask them, not us! :).


👤 jaimie
Unplug. Exercise. My only good research was done during the time when I was physically and mentally fit. There is a tendency (especially in CS programs) to think that more hours will help you solve the problem. This is not always the case.

The other thing is to prioritize long-term tasks first. The biggest problem with a dissertation is that it is a large, complicated project, on a scale most students haven’t experienced before, even in a communal setting like software engineering. You really do have to wake up and start writing in order to get the work done. I found the old productivity blogs that championed 3 Most Important Things to be a decent way to frame the world.

For tooling, I found Todoist and Trello useful. Zotero for citation management. Having a lab that uses Slack (or another collaborative space) really accelerated my work.

Final note: don’t avoid your advisor if you don’t have the work done. Be up front about what you do and don’t have done. Your success is their success. When you succeed, you prove that they made a good selection. No one wants you to fail. Research is a moving target and you’re not always going to make deadlines. Figure out how to keep everyone in the loop so that you keep a mindset of learning at the front, rather than a mindset of “falling behind”.


👤 smitty1e
If you're Emacs savvy (Spacemacs user myself) why not use the org-mode tools to work the LaTeX?

Also, references in Zotero => https://www.zotero.org/

There is a good Android client for Zotero, so toting around research on a tablet is a cheap approach.


👤 elviejo79
It was a long time ago that I was in academia, but the advice that I would give a student today would be: use beeminder.com to set a bet with yourself to write 500 words daily... on anything... just the habit of writing is extremely important. Try to use the 5-paragraph essay or a logical tree exposition.

Use org-roam for Zettelkasten notes (those count toward the 500 words).

And try to use literate programming (org-babel) to make your articles reproducible.

Use NixOS on your personal computer so that you can control exactly what dependencies and libraries your papers need.


👤 ivalm
For my PhD I have to say I just kind of winged it. However, nowadays I use notion.so for personal organization.

Basically I define schemas for types of data I want to handle (tasks (which you can view as kanban boards), papers I’ve read/want to read, ideas, meeting notes, documents) and then define them as Notion databases. I extensively use tags to make things searchable, cross-references, and custom views (via linked databases).


👤 alksjdalkj
You'll get tons of solicited and unsolicited advice - a lot of it will be contradictory, a lot will actually be the advice-giver's personal preferences disguised as advice, a lot will be aspirational or abstract and not super helpful. Trust yourself to identify the useful bits but don't be afraid to disregard most of what you hear.

👤 hvocode
I wouldn’t overthink the tools. I found a mostly minimal toolset was sufficient when I did my PhD: actively used repositories to make sure I never lost my work (private repos so I didn’t have to care what I put in them), plain old emacs for editing, and a mix of Mathematica, Matlab, and Python for my numerical and plotting needs. Used physical sticky notes and cheap legal pads for notes and keeping on top of todo lists. I printed papers to read since I never found a digital method to maintain the deep focus I needed to fully read the papers. I generally discourage students from worrying much about tools: at the end of the day if you’re thinking more about a tool or workflow process than your research work, you’re not making progress towards the degree. Use whatever tools and methods allow you to be productive without thinking about the tools or methods.

👤 hoppyhoppy2
>Whats you're

*What's your


👤 rsfern
Since you’re an Emacs user, I highly recommend org mode and org-ref for technical writing. For collaborative writing I’ll export to Overleaf, but I’ve never found anything quite as good as org mode for organizing, building, and revising research writeups.

You don’t have to go all in on org for notes and TODO tracking (I’ve tried and haven’t made it stick).

https://github.com/jkitchin/org-ref


👤 timothylaurent
Check out https://obsidian.md/ for a note-taking app. It's a local Markdown file-based knowledge base. It is great for Zettelkasten workflows since links are first-class citizens; it also has a rich plugin system to extend and customize your workflow. It's also free for personal use.

👤 getpost
Not what you're asking, but my ideal PhD workflow is to complete an open research problem as a homework assignment, then turn it in as a thesis.

"In 1939, a misunderstanding brought about surprising results. Near the beginning of a class, Professor Neyman wrote two problems on the blackboard. Dantzig arrived late and assumed that they were a homework assignment. According to Dantzig, they "seemed to be a little harder than usual", but a few days later he handed in completed solutions for both problems, still believing that they were an assignment that was overdue.[4][6] Six weeks later, an excited Neyman eagerly told him that the "homework" problems he had solved were two of the most famous unsolved problems in statistics.[2][4] He had prepared one of Dantzig's solutions for publication in a mathematical journal.[7] This story began to spread and was used as a motivational lesson demonstrating the power of positive thinking. Over time, Dantzig's name was removed, and facts were altered, but the basic story persisted in the form of an urban legend and as an introductory scene in the movie Good Will Hunting.[6]

Dantzig recalled in a 1986 interview in the College Mathematics Journal, "A year later, when I began to worry about a thesis topic, Neyman just shrugged and told me to wrap the two problems in a binder and he would accept them as my thesis."[8]"

https://en.wikipedia.org/wiki/George_Dantzig


👤 harles
I’d recommend using your own style of Zettelkasten. Roam Research is an excellent tool for this, but it’s a matter of preference. Keep your to-dos lightweight (I like Asana so I can add comments, but again, personal preference). If you have a team size of n=1, then some sort of agile is just silly overhead.

Biggest thing I can recommend here is don’t conflate tools with process.


👤 devnull3
Tangential question: I've always been curious what is more challenging: 1) finding interesting & original problems to research or 2) doing the actual research?

I can imagine the struggles of the research. Does the thought "Damn! I should have chosen a different problem" cross your mind when researching (which leads to question #1)?

PS: I am just a graduate with no intention of doing a PhD.


👤 gugtude
Honestly if you can just stick with those tools you’ll already be in the top 1% most organised. Most students just wing it.

👤 hexomancer
I wrote a research-oriented PDF viewer during my own PhD. The ability to quickly find and search references proved invaluable to me (obviously I am biased though :) ). Here it is:

https://github.com/ahrm/sioyek


👤 sideproject
On the tools side, saw this project the other day for annotating and collaborating on research papers. It's not live yet, but something like this would have been useful during my PhD days.

https://www.scholars.io


👤 templarchamp
If your research involves collaboration, maybe you could ask in your department, for example, so it becomes easier to share and exchange thoughts? Just a thought.

👤 kevinslin
would recommend checking out https://wiki.dendron.so

it’s an open source, local first, markdown based note taking tool that is integrated with vscode/cli

(full disclosure: i’m the creator and happy to answer any questions as well on our discord: https://discord.gg/NDYd9nYkM3)


👤 elvyscruz
https://web.hypothes.is/ is your friend for annotating the web.

👤 giantg2
My ideal PhD workflow is to not do it at all. The best stack workflow is what fits the data available to your company, which varies greatly.

👤 howolduis
Use Sci-Hub and Zotero for papers (there is a Sci-Hub extension that downloads the articles).

Use GitHub (or any VCS) for your papers and thesis to keep actual track of changes.

GitHub has great kanban boards that you can use (just make a private repo and create a project inside it).


👤 unixhero
By god damn have a backup strategy for your data and for your thesis doc.

👤 iflp
Obligatory link: a research to engineering workflow

http://dustintran.com/blog/a-research-to-engineering-workflo...