HACKER Q&A
📣 bendee983

Is PyTorch better than TensorFlow for production?


A recent article by InfoWorld claims many enterprises are moving from TensorFlow to PyTorch:

https://www.infoworld.com/article/3597904/why-enterprises-are-turning-from-tensorflow-to-pytorch.html

The article seems a bit lopsided:

- There's only a quote from Facebook AI and no quotes from Google's TF team

- I'm pretty sure you can find several examples of organizations that were more comfortable with TF than with PyTorch.

But the article cites some claims from company execs that I wanted to check with the community here, to see whether anyone else has had the same experience:

“The TensorFlow object detector brought memory issues in production and was difficult to update, whereas PyTorch had the same object detector and Faster-RCNN, so we started using PyTorch for everything.”

“On distributed compute [PyTorch] really shines and is easier to use than TensorFlow, which for data parallelism was pretty complicated.”

“At the start we used TensorFlow and it would crash on us for mysterious reasons. PyTorch and Detectron2 were released at that time and fitted well with our needs, so after some tests we saw it was easier to debug and work with and occupied less memory, so we converted.”
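On the distributed-compute quote above, here is a minimal sketch (my own example, not from the article) of what data parallelism looks like in PyTorch with DistributedDataParallel. It runs as a single-process "gloo" group on CPU so it stays self-contained; real jobs launch one process per GPU (e.g. with torchrun), and the address/port values are arbitrary.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process process group so the sketch is runnable anywhere (assumption:
# port 29501 is free); multi-GPU jobs would set rank/world_size per process.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(torch.nn.Linear(4, 2))  # DDP wraps the model; gradients sync across ranks
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(8, 4), torch.randn(8, 2)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()  # the gradient all-reduce across ranks happens during backward
opt.step()

dist.destroy_process_group()
```

The point being made in the quote is that the wrapping is one line; the same training loop works unchanged whether you run one process or many.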

The article also alludes several times to the PyTorch community being more active and supportive than TF's.

To be clear, I love PyTorch, but I don't have enough production-level experience to confirm or reject these claims, and TF's side is not well represented in the article.


  👤 lacker Accepted Answer ✓
This question is just way too general. You will find many organizations successfully using both TensorFlow and PyTorch in production.

Honestly, in my opinion the libraries are very similar in their capabilities. Either one is a fine choice. If you aren't familiar with either, I would suggest PyTorch as the easier one to learn, but if you have more experience with one, then by all means stick with it.


👤 Jugurtha
As far as I'm concerned, it is not about mutual exclusion. I say that from the perspective of someone involved in building a machine learning platform[0] that lets users use whatever they want in their experiments without being bothered with setting up their environment, or with differences in experiment tracking or deployment.

We used to build custom products, and while the multitude of frameworks has been useful, the fragmentation and the need to deal with each framework's peculiarities have hurt us. Now we build around those differences to provide a cohesive experience.

I think these frameworks should be looked at differently from the way we look at web dev frameworks. You don't change frameworks every day in web dev, but in machine learning you want to experiment with as many things as necessary to get your results: TF, PyTorch, MXNet, scikit-learn, XGBoost, whatever gets results. The problem is that this experimentation takes a toll and is painful.

I believe that progress in a field is inversely proportional to the duration and cost of experiments. We're trying to reduce both to their systematic, irreducible values.

- [0]: https://iko.ai


👤 t-vi
I'm not neutral, but here goes:

Just as there are many different applications of deep learning, "production" is quite heterogeneous (you put something behind a server, or integrate it into a large codebase written in $foo, or put it on some embedded device or phone, or deploy on one or several machines with or without GPUs or other hardware acceleration, ...). I would not expect a single tool or method to be universal, which means I would not expect either framework to be uniformly better than the other. But it also means that a tool has an edge if it allows deployment in a wide range of ways. In my view, it is nice that PyTorch has achieved a high level of consistency between the Python interface and the other ways of using and deploying it. PyTorch also has some vision of interop (e.g. with ONNX).

The other part is that when you look at a complete lifecycle (detecting and analysing potential update needs, updating models, and deploying the updates), it seems beneficial to be able to go back and forth seamlessly. In my experience, PyTorch does a good job here (although I might also have ideas for further improvements), in that you can take, say, a JIT-exported module, look at it, and still see much of the structure of how it was originally written. Taking, say, a tflite-exported model and looking at what it does seemed to involve much more low-level inspection.
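What I mean by the JIT-exported module keeping its structure, as a small sketch (my own example module, not from any real codebase):

```python
import torch

class Scale(torch.nn.Module):
    def __init__(self, factor: float):
        super().__init__()
        self.factor = factor

    def forward(self, x):
        # Even this control flow survives export in readable form.
        if x.sum() > 0:
            return x * self.factor
        return x

scripted = torch.jit.script(Scale(2.0))
print(scripted.code)  # prints Python-like source recovered from the scripted module
```

`scripted.code` shows the forward method roughly as it was written, names and branches included, which is what makes going back and forth between the deployed artifact and the source feel cheap.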


👤 SomeoneFromCA
I used Darknet (a C library) and custom NNs written in good old C++ and CUDA. I've never had any success with Python frameworks; they all felt fragile and not very performant.

👤 villgax
Make your choice after glancing over the unresolved issues for the features you need in each.