HACKER Q&A
📣 gavribirnbaum

Best way to learn computational biology/immunology?


I am a biomedical scientist considering a switch to computational biology (specifically computational immunology). I would love to get to the point where I can land a job in it within the next 6 months. Any tips will be much appreciated.


  👤 __keshav Accepted Answer ✓
Computational biology is a pretty broad term. Usually things that have to do with computers & biology get called either bioinformatics or computational biology. Briefly, for bioinformatics you’d need things like C++ under your belt and an interest in finding ways to make things run really fast and efficiently on huge sequencing datasets.

Comp bio is a super fun field to be in. For me it's mostly using computers to do biology. But it’s a mixture of domain knowledge in bio, a good grasp of stats, and a whole lot of programming (usually not terribly difficult tasks, though).

Basic Python and R are what you absolutely need to know (I started with intermediate Python, no R). To do comp bio well, you need to learn computational statistics. I can’t stress enough how much knowing statistics matters here: all sorts of libraries make assumptions about sequencing data, and you need to decide for yourself how you’ll go about things and produce good science.

On a practical level for comp bio, I suggest:
1. Learning Python & R
2. Basic knowledge
3. Knowing what your fave labs use for techniques (e.g. NGS? What kind of NGS?) and learning how it works
4. Learning probability & statistics (lin alg always helps too)
5. If you get bored, learn clustering methods... Because, good god, people in this field love seeing pretty tSNE figures and 98% of them have no idea how they just produced what they did but make biological assumptions based on it. You’ll probably have to learn them anyway.
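To get a feel for what a clustering method actually does under the hood, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. Note this is k-means, not t-SNE: t-SNE is a dimensionality-reduction method for visualization, and it is a separate thing to learn. The function names, the toy 2-D points, and the fixed seed are all my own assumptions for illustration; on real data you would reach for an established library rather than this.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=20, seed=0):
    """Toy Lloyd's algorithm on a list of (x, y) tuples.

    Returns (centroids, labels). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Two obviously separated toy blobs (made-up numbers).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The thing to notice is how many choices are baked in even here: the number of clusters, the distance metric, the starting points. Those are exactly the kinds of assumptions that end up driving the pretty figure.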


👤 emiller88
The resource I recommend to people looking to move from wet lab to dry lab stuff is https://www.biostarhandbook.com/. From your post history it looks like you already have some programming experience, so you could skip the first few chapters, which are just a Linux intro. I don't think it has all the best practices, but I think it's the most comprehensive overview that starts from square 1 and fills in all the gaps no one tells you about when you first start, for example the "Common data types" chapter.

👤 pandatigox
I think one of the easiest ways to get into it is by learning to use a piece of software called MaxQuant Perseus (https://maxquant.net/perseus/). It's like an advanced Excel that was designed so scientists don't have to learn R but can still get the job done. Good luck with your journey!

👤 anderspitman
I think the main question you need to answer for yourself is whether you're more interested in biology or programming, i.e. do you want to use software tools to do biology (computational biology), or do you want to make the tools that others use (bioinformatics)?

If biology, you need to focus on bio, stats, Python, R, and a hundred other specialized tools for working with data.

If you're more interested in programming, you can get away with much less bio/stats knowledge, unless you're working on developing low-level algorithms. A lot of the work has more to do with efficiently storing, moving, and visualizing large datasets. Bonus here is that much of this knowledge is transferable to other (much higher paying) domains if you get burned out or want to sell out.

My current job could be described as bio-aware web development, with an emphasis on data visualization. I need to know a decent amount of biology, but I can almost always defer stats to others in the lab with more expertise.


👤 f6v
Knowing how to program is a must: Python, R (both are quite popular). Being hands-on with Linux helps as well, as many real-world datasets won’t fit on your laptop, so you’ve got to use high-performance computing infrastructure. But it’s mostly about being able to make inferences from data. You need a solid stats background for that.

There’s a ton of courses online and https://www.edx.org/bio/rafael-irizarry is a good start.


👤 dannykwells
The vignettes for Seurat are the place to start.

Also, I'm a founder at Immunai and this is literally what we do. Please dm if you have further questions. Happy to help however I can.


👤 netizen-9748
Are you specifically interested in immunology? Computational biology is a pretty broad umbrella. My personal experience was learning bioinformatics, which was heavy on Python and genomics/proteomics.

A brief search for comp immunology turns up things like data mining and mathematical modeling, so I would assume Python and R would be a good place to start. You may even be able to find some lectures online that cover the basics.


👤 psyklic
There is a very nice set of bioinformatics coding challenges here: http://rosalind.info/problems/locations/
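To give a flavor of the early problems there (counting nucleotides, reverse-complementing a strand, computing GC content), here is a rough sketch in Python. The function names are my own and not anything Rosalind prescribes, and it assumes clean uppercase DNA strings with only A, C, G, and T.

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_nucleotides(dna):
    """Return the counts of A, C, G, T (in that order) in a DNA string."""
    counts = Counter(dna)
    return tuple(counts[base] for base in "ACGT")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def gc_content(dna):
    """Percentage of bases in the string that are G or C."""
    return 100 * (dna.count("G") + dna.count("C")) / len(dna)


print(count_nucleotides("AGCTTTTC"))
print(reverse_complement("AAAACCCGGT"))
print(gc_content("ATGC"))
```

Real sequencing data adds ambiguous bases like N, lowercase soft-masking, and file formats such as FASTA, so treat this as a warm-up rather than something to use as-is.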

👤 warlog
dn/dt

And then

dn/ds

Learn everything that leads to and comes from these two equations.