Docker seems to be nearly an industry standard by now. Some people treat it like an obvious choice, but it's not so obvious to me.
People say that Docker can run anywhere, and that it solves the infamous "works on my machine" problem. Despite what people claim, I have not found this to be the case. I developed containers on Windows and then still had to debug them when deploying on Linux. There were formatting issues, file-not-found issues, and chmod issues. I have spent so much time configuring Docker when I could have completed the same task in a VM in a fraction of the time.
Am I alone here? Am I doing it wrong? Is it the case that I am not the intended audience, and it's meant for larger teams?
Most people who use Docker don't know how much computing power and network usage they are wasting. Companies don't care as long as they can show some revenue.
I can guarantee that 99% of projects using Docker, or any sort of containerization, are wasting 99% of their compute resources.
Just waiting for those low-skilled people to reply about how wonderful Docker is and to say "if it's not working, you are not doing it right."
It's not perfect, but it's good enough and there is a lot of support for it, so it's easy to plug into my team.
I basically avoid Windows altogether when doing development now. The exception is video game development, and even then I wish I didn't have to use Windows.
Docker containers are standard for a reason. In fact, I can't imagine not using Docker as a daily driver nowadays. Within seconds I can spin up a sterile container for my development environment. Or an application. Or a data store. Or a queue. It's all isolated, and everything communicates seamlessly with everything else. I'm now even compiling Unreal Engine servers on Windows and deploying them on Linux machines using Docker containers.
Are you using Docker Compose by the way?
Which is good enough for me, considering how easy it is to install, distribute, run, and manage apps in Docker versus setting everything up on bare metal and fighting issues in complex setups until your eyes are red.
> I developed containers on Windows and then still had to debug them when deploying on Linux. There were formatting issues, file-not-found issues, and chmod issues
You probably mean "building", not "deploying". If you've built it on Windows, the container image will be the same on Linux.
> I have spent so much time configuring Docker when I could have completed the same task in a VM in a fraction of the time.
Now imagine spending this time over and over and over again.
> Am I doing it wrong? Is it the case that I am not the intended audience, and it's meant for larger teams?
Have you been using tools like Vagrant or Ansible?
I think the most common mistake is using Docker as "essentially a VM", with typical VM workflows, instead of as a build automation and distribution tool.
This abstraction was quite appealing to developers and others who think we don't need to understand or pay attention to infrastructure anymore.
This caused the hype, which brought many organisations to chaos and, subsequently, a drift away from Docker. By then, however, the bonuses had been paid and the promotions for "innovations" had been handed out.
Docker is a fine thing to play with, but it should be kept few thousand miles away from production.
Personally I don't see where it fits, however large or small the organisation is.
At work, we recently started using GitHub Codespaces for development. It uses Docker to set up the dev environment. It's been fantastic and legit has solved 99% of "works on my machine" problems.
What's causing your file-not-found issues? Surely it's not actually Docker, but rather some poorly written Dockerfiles.
Docker's core strength is that it provides your applications with process, network, and filesystem isolation on an individual host. Applications are very sensitive to their host environment and before the popularity of containerization a lot of effort was spent ensuring that host machines were provisioned in a manner that made them capable of supporting application needs. Things could often get hairy when deploying multiple applications to the same host due to conflicting lib versions, network ports, environment variables, file permissions etc. Things could also become precarious when there was a need to manage rollbacks across dependency upgrades on the host.
With docker (or a similar container solution), all of that pain is abstracted away, you don't have to think about the host machine at all, the container coddles your application so that it has a pristine working environment, conveniently configurable with environment variables.
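To make that concrete, here is a rough sketch (the image tags, ports, and the APP_ENV variable are just illustrative): two apps that need conflicting runtime versions and the same internal port can run side by side on one host, each with its own filesystem, network namespace, and environment.

```sh
# Two containers with conflicting Python versions, both listening on port 8000
# internally, mapped to different host ports; each gets its own isolated
# filesystem and environment.
docker run -d --name app-legacy -p 8081:8000 -e APP_ENV=prod python:3.8-slim \
  python -m http.server 8000
docker run -d --name app-modern -p 8082:8000 -e APP_ENV=prod python:3.12-slim \
  python -m http.server 8000
```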
Linux containers do that, not Docker. However, that statement isn't as universal as some readers might think. Linux containers are just that: for Linux. In reality, whether two containers run the same way also depends on how the host kernel has been configured. Windows has Windows Containers, and the BSDs have jails. They're all quite different because their host operating systems are all quite different. Windows and macOS are usually running Linux containers in a hypervisor.
Docker, compared with manually building container images and standing up containers by hand the way you would with raw LXC, is extremely useful. It also inspired the growth of the Docker runtime.
In theory it is also better because you can manage a fleet of containers just by looking at their memory and CPU usage and not really caring about what's inside, but I think the reality is that k8s is so complex it's not clear the tradeoff of finer-grained management is worth the difficulty of setting it up. Maybe, depends on your use-case.
People are saying the issue is that you're running it on Windows but I would counter that running on macOS can be just as bad if you're trying to run x86 containers on an ARM box.
Docker is abused as a developer environment though and I hate seeing teams try and force everyone to develop inside a container that eventually gets merged with the service itself and turns into a 2GB image nightmare.
It can also help avoid issues when one person is working on multiple things that have conflicting dependencies.
Docker "solves" the problem of Linux system configuration, but the solution it provides is literally the equivalent of "shove it under the rug". It doesn't solve the problem; it just attemps to hide it.
The way I solve the problem is by eliminating it; by making programs that don't require any system configuration.
Most people like to program in an interpreted language and use a separate engine to store and manage the data. The language runtime only has a "debug" server, so you need a separate http server and you need to configure it so that it can execute scripts written in your favorite language.
So this means the system/environment needs to be set up so that 1) an http server is installed and configured, 2) the specific version of the interpreter is installed on the system, and 3) the database engine is set up and configured. I wish it stopped there, but it gets much worse.
So what I do instead is use libraries for all the above. I use a compiled language. The http server is a library. The database engine is a library. Therefore, nothing needs to be configured in the environment. Just upload the binary and run it.
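A minimal sketch of that approach (Go here purely as an example of a compiled language; the handler and port are arbitrary): the http server is just a standard-library call inside the binary, so there is nothing to install or configure on the host.

```go
// Minimal sketch of the "everything is a library" approach: the http server
// is part of the binary itself, so the deployable artifact is a single
// self-contained executable with no host configuration required.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello from a single self-contained binary")
	})
	// An embedded database library (e.g. bbolt, or SQLite compiled in) would
	// play the same role for storage; omitted to keep the sketch dependency-free.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```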
Some years ago I had some epic battles with deploying some fancy software in several environments including online and internally on-premise servers.
I do like challenges but some experiences get old fast. Understanding why your software is failing when deployed, fixing deployment configuration files etc. The cognitive load becomes heavy having to remember, check and fix. On top of that deployment environments always differ enough to somehow require individual attention.
So, that's when the lightbulb switched on in my head. Using (Docker) containers has completely removed these adventures in deployment. And I'm very happy about this; I don't have to deal with that stress anymore.
Now, with more experience and a black belt in container kung fu, I’m understanding what the hype around kubernetes may mean but I’m also sensing that a David is around the corner to knock down this Goliath.
Running multiple heavy VMs which segment system resources gives you even better and more complete separation at the cost of being very performance intensive. The industry by and large has opted for containers as a compromise between reasonable separation and performance.
Also, as Docker grew in popularity, an ecosystem of tooling was built around it for container orchestration, image repositories, and development, which makes it worth using just to get access to that surrounding tooling.
The Dockerfile contains the steps for creating the image. With docker-compose, you can codify how to set up services.
This is better than the alternative of doing everything by hand, with manual commands and ad hoc edits to random system config files. The manual way is error-prone and not easily repeatable.
I use docker-compose for our internal gerrit, wiki and XMPP server setup.
If I were to do this setup by hand, e.g. on a real machine or VM with manual edits to random config files all over the place, it would not be clear exactly what I need to back up or how I would rebuild the server if it were destroyed somehow.
With docker-compose, the data I need to back up is all in one directory on the host machine, and the services/servers themselves can be recreated from the Dockerfile and the docker-compose file.
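A rough sketch of that layout (service names, images, and paths are illustrative rather than my actual setup): every piece of persistent state is bind-mounted under a single ./data directory on the host, so "what do I back up?" has a one-line answer, and the services themselves are recreated entirely from this file.

```yaml
# docker-compose.yml (illustrative sketch; container paths depend on the images you use)
services:
  gerrit:
    image: gerritcodereview/gerrit
    ports:
      - "8080:8080"     # web UI
      - "29418:29418"   # SSH for git
    volumes:
      - ./data/gerrit:/var/gerrit   # all persistent Gerrit state under ./data
  wiki:
    image: my-wiki-image            # hypothetical image built from its own Dockerfile
    volumes:
      - ./data/wiki:/srv/wiki       # hypothetical data path inside the container
```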
I resisted it at first but now I cannot do without, neither in team projects nor personal ones. It can be improved but it is better than the alternatives that I know of.
If you are okay with:

- installing Docker on the servers
- keeping Docker up-to-date
- keeping a private image registry (or renting one)
- regularly scanning Docker images for vulnerabilities
then Docker is fine. But if you are a solo developer, Docker is probably overkill. I mean, working with Docker doesn't just mean "write your Dockerfile"; it means all the bullet points I wrote above. Writing the Dockerfile is the easy part. If I rent a machine on DigitalOcean for $5/month, then I have to put on the "infra engineer" hat, install Docker on that machine, and keep it up to date, and I also need a way to store my Docker images and scan them...
I once spent three weeks with several colleagues, working from several hundred pages of documentation to set up shit, and the error was always a bad IP address, a wrong configuration, or missing values from arcane places.
For me the value of Docker was clear from the beginning, and it was love at first sight.
I just had this discussion with a colleague. It does, semi-half-assedly but quite effectively, what nix promises. You can make hermetic images of not just binaries (that'd only be as good as static linking) but of entire machines (sans Linux kernel, because you're running what the host runs, but that's always Linux so it's fine, and when it's not Linux you're not having a good day).
Wrap that up in a stackable system (build your own images atop of others) and a way to quickly run processes out of images, and you've got yourself a highly portable and reusable system for apps. Want to build an app to run on a random k8s node in AWS? shove it in a docker image and now you've just got to write some yaml.
Got a few Python libraries that you want packaged atop of pandas+jupyter+numpy+whateverelse? Reference a stock prebuilt Jupyter image (there are many official ones you can start with) and add the few packages you want in a Dockerfile, and bam, you've got a redistributable data science appliance as a Dockerfile.
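Something along these lines, as a sketch (the base image is one of the official Jupyter stacks; the extra packages are placeholders for whatever you actually need):

```dockerfile
# jupyter/scipy-notebook already bundles pandas, numpy, matplotlib, etc.;
# we only add a handful of extra packages on top.
FROM jupyter/scipy-notebook:latest

# Hypothetical extras; swap in whatever your project needs.
RUN pip install --no-cache-dir plotly xgboost
```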
I can build docker images on my laptop and then run them on my crappy celeron nas for months without thinking twice about it.
Now that Docker is the mainstay of beginner/intermediate engineering efforts, it's just kinda meh. It's not containers, it's the way communities prop weird stuff up, as if Docker were some prerequisite for moving fast. And hey, options exist like Bazel and Nix (and hopefully more to come) that build reproducible systems without always needing a hypervisor.
And also, I find it super irritating that you don't get any simplicity out of it until you write a docker-compose.yaml. Which makes Docker feel like a tool without batteries included.
Once people start using it as a dev container too, it's, like, gross. Just my opinion though; it's also a fine choice for ballin' out, so I respect any and all hustle at the end of the day. To each their own.
I can agree that the problems start at the edge between the host platform and the containers: networking, mounted volumes, conflicting user and group IDs.
I've taken to using the s6 init system to create self-contained application stacks inside containers, with log rotation, cron jobs, start/stop ordering, etc. Months later I can return to prior work and spin the whole thing up on the first try.
Do whatever, tag, push, deploy.
Yes, there are big missing pieces. The whole one "process" per container mantra is tragic. Those that spout it don't even have a correct understanding of the term "process." Aligning UIDs between hosts and containers is awful. Many other hang-ups and glitches.
Not perfect. VMs and package managers streamlined to the ease of use that Docker delivers would be better. Much dysfunction has its roots in trying to containerize things that should be VMs.
I’ve had one or two coworkers on Windows and they used WSL I think. Can’t comment on their experience.
The Dockerfile prescribes exactly what operating system and packages the container has. Docker Compose sets up dependencies in other containers, and in most cases this was all that was required to run and test. It does require you to make some platform-agnostic decisions about how you structure both, but I find it easier than scripting.
Outside of that YMMV.
Yeah, don’t do that.
I don't see anything hype-driven about it. You might just be frustrated by some of the minor issues.
It's not actually cross-platform as the containers run on the host kernel, meaning images are CPU-architecture-specific.
NPM also has this issue, where the OS architecture is important for some packages containing binaries.
The major appeal of Docker to me is that it's a new/extra layer of security. You can run known-vulnerable software inside it, and when it gets compromised, so what? The attacker lacks the tooling to pivot or to escape the container.
Anecdote: at one of the companies I worked for, with a smallish team, the different tools were written in different languages needing all kinds of dependencies, and Docker was invaluable in making the infrastructure easy to run without having those dependencies installed in the infra itself.
I prefer to keep target operating systems uniform, so a docker layer isn't needed. Just an install script which downloads packages and copies systemd service files needed.
Docker fixed that.
It's not about Docker itself; it's just a tool. But with Docker, you know how the team's work gets done.
No. Probably not. Maybe, but also probably not, you may just not have had the "Ah-hah!" moment. A lot of technologies are like that.
Take any random Windows or Linux server someone else has built. Log in to it. Now tell me how it was built such that that build can be reproduced on a slightly different platform, e.g.: a different processor or base image.
Good luck with that.
Traditional operating systems are write only. Trying to get the configuration back out after the fact is effectively impossible in the general case.
Even if you have a "scripted process", "policy", "documentation", or "declarative state configuration", drift occurs anyway because of things such as disk corruption, updates applying out-of-order, different levels of updates, quick fixes/tests, etc, etc...
It is basically impossible to keep a large fleet of supposedly identical machines identical.
As a random example from the Windows world: I've seen Group Policy applied to user accounts such that when an admin user logs on to a server, that triggers an on-login install of some software. Could be anti-virus, a print utility or whatever. Suddenly pristine servers become... not so pristine.
People have tried to work around this kind of thing... clumsily.
For example, in the Windows world you can build standard operating environment (SOE) disk images for servers using a tool like System Center Configuration Manager (SCCM) task sequences. These can deploy Java runtimes, ODBC drivers, or whatever your business app needs. (Linux has things like Packer.)
The principle is fine, the implementations however are terrible.
A complex SCCM Task Sequence might run for hours and then fail at step number 57 out of 83. Fix it, start from the beginning.
Oops, failed again after several more hours. Fix it, try again.
Success! Now step #62 is failing. Fix it, try again.
This can mean days down the drain. Wouldn't it be just awesome if the build system could take snapshots of the VM at each step, so that it could restart at the last known good step, making the edit-build cycle super fast?
Docker does that.
Build systems like SCCM task sequences or Packer aren't typically parameterised, or the parameters are "baked in" at build time, and/or get passed at annoying points such as a very early "network boot" step, which has technical and security limitations.
Docker does the right thing, simply passing in parameters as environment variables at runtime. This allows one image to be used in many environments without changing it. If you test something in UAT, then production will be byte-for-byte identical (SHA256-hashed!), except for the minimum required config changes.
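Roughly like this (the image name, tag, and variables are made up for illustration): the very same image runs in UAT and in production, and only the runtime environment differs.

```sh
# Same image in both environments; only the environment variables change.
docker run -d -e DATABASE_URL="postgres://uat-db/app"  -e LOG_LEVEL=debug myapp:1.4.2   # UAT
docker run -d -e DATABASE_URL="postgres://prod-db/app" -e LOG_LEVEL=info  myapp:1.4.2   # production
```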
Typical build systems produce a single opaque binary image. This could contain anything, and the "link" to the source script that produced that build is immediately lost.
Docker maintains the step-by-step build script that led to the final Docker image in the image itself.
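For example, docker history prints the instruction that produced each layer, straight from the image metadata (nginx here is just a stand-in for any image you've pulled or built):

```sh
# Show the per-layer build steps recorded inside the image itself.
docker history --no-trunc nginx:latest
```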
Typical build/deployment systems require "full VMs" and associated images, which start from merely big at ~32 GB and go up to a way-too-big ~127 GB as is the typical case for servers in the cloud. These are mostly blank space, and need the wiggle-room for things such as after-the-fact updates.
Docker maintains an efficient content-addressed store of image layers. A small change on top of an existing chain of image layers might need just tens of megabytes. Deploying this to the production environment similarly requires only tens of megabytes, allowing very efficient blue-green deployments rolled out automatically via dev-ops. Existing deployments are maintained as-is and there's no side effects of a roll-back. Compared to this, updating any shared framework or system-wide installed component is very risky on any traditional operating system.
Traditional server management ends up growing "paperwork" and "processes" like crazy. This is required to work around the risks listed above. So for example, changes have to be submitted to a change control system, but this is not enforced in any mechanical way. The paperwork can say "X", but the physical change can be "Y". Or details may not be present. Or whatever.
Docker lets you check the "dockerfile" into Git. That's authoritative, versioned, and can be made subject to mandatory branch security policy such as forced code review. It can be set up such that no paperwork is needed. The code is the code, and if it's in the "main" branch it has been reviewed. The image it builds is the image that is in production, end of story.
DISCLAIMER: Having said all of the above, I have literally never deployed containers in production, despite working in the cloud for years. My issue is similar to yours: I'd have to convince a lot of people to change their fundamental workflow and embrace a "different way of doing things". This can be challenging, especially in large enterprise with unmotivated staff, unions, third-party support contracts, etc...
Nowadays almost all of the software that I write runs in OCI-compatible containers (typically Docker), because it lets you not care as much about what is inside of your container: if you follow the 12 Factor App principles, deploying Java, .NET, Python, Ruby, Node, etc. apps all looks the same from the outside.
Same for healthchecks, same for scaling, same for restarts, same for exposing ports, same for limiting CPU/RAM resources. You could do a lot of it with either systemd or bunches of other configuration on a typical *nix distro (hopefully automated through Ansible or something like that), but it's just more convenient, if you buy into that workflow.
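As a sketch of what "the same from the outside" looks like in a compose file, regardless of the tech stack inside (the service name, image, and health endpoint are placeholders, and it assumes curl exists in the image):

```yaml
services:
  api:
    image: my-api:latest          # could be Java, .NET, Python, Node... it doesn't matter
    ports:
      - "8080:8080"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    deploy:
      resources:
        limits:                   # CPU/RAM caps; honored by newer docker compose versions
          cpus: "1.0"
          memory: 512M
```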
That said, for some tech stacks, Docker is hot garbage, for example, I've never worked with anything as annoying as PHP can be, line endings, file permissions and god knows what else messing everything up, especially if you're unlucky enough to have your dev box be on NTFS (e.g. Windows), about which I wrote a little rant called "Containers are broken": https://blog.kronis.dev/everything%20is%20broken/containers-...
Of course, I will still keep using containers, because that almost wholly decouples what OS my servers are running from what the applications need to run (I just need Docker or a similar runtime available for that OS) and I can have as many parallel instances of whatever I need (e.g. 5 MySQL instances, 4 PostgreSQL instances, 3 Node instances, 2 Redis instances etc.) all on the same node, each with their own configuration and limits. Both on the server and locally. The same goes for the apps that I might build and want to upgrade.
Seriously, you've no idea how liberating it is to be able to take bad or possibly insecure software and put it in its own little box where it can't mess too much up (especially relevant for something like Python, where the story around packaging software and environments can be a dumpsterfire, but also even simpler cases like Node/Ruby/JVM versions and PATH variables, as well as system packages) and do so easier than cgroups directly or systemd slices would let me.
I'd still like to ditch Windows for *nix on my dev box as well, but sadly some great software (MobaXTerm for example) and most games still don't work. Of course, one could just dual boot and thus also decrease the risks of games and such not being trustworthy, but for my personal stuff I'm kind of lazy and the *nix distro that sits on the other disk partition gets underused somewhat.
There were supposed to be a lot of benefits to using Docker:
1. Build your environment anywhere and it will work anywhere. This is mostly true. Basic isolation of dependencies is a huge headache-solver for C/C++ development, Python apps, and anything with a complicated build chain. Go is a notable exception - the Go build process really did solve this without containers.
2. Process/resource/cgroup isolation. You can run multiple containers (even untrusted ones) on the same metal and they can't interact. It's like a VM but instead of committing fixed amounts of RAM and disk space (which mostly sits idle), you let the Linux kernel handle it dynamically with limits. At least that was the dream. But without carefully managing properties/permissions this is a security nightmare. When people say "why Kubernetes" this is one real answer. Almost no one trusts container isolation for actual security in practice.
3. Layered builds. This was a killer idea - you build your container in steps. The first few change infrequently and the last one or two are your actual app. You can ship the big OS-heavy layers once, push frequent small updates (a few MB), and still keep your dev and runtime environments identical. This breaks down a bit in the real world because a lot of people don't bother to layer things well (see the sketch after this list) - they wind up with 100MB-1GB monsters being pushed around endlessly. But also it kind of doesn't matter - you can push 1GB monsters on your own network/VPC all you want and it's not really a big deal.
4. Declarative builds - theoretically you can recreate a bit-identical build based on just the Dockerfile. This is great for auditing third-party images and reproducibility. Theoretically. But people forget that apt-get install on Monday might be different than apt-get install on Wednesday. Debian does security updates that way and an image built 6 months ago won't match one you build today. It still works, though, so we pretend not to notice. And sharing image layers with another host is actually kind of hard. I imagine Nix solves a lot of this but I haven't had the pleasure of learning it yet.
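A minimal sketch of the layering idea from point 3 (a Python app purely as an example; requirements.txt and main.py are hypothetical names): the heavy, rarely-changing layers stay cached, and only the small final layer changes on every deploy.

```dockerfile
FROM python:3.12-slim                 # big, rarely-changing base layer

WORKDIR /app

# Dependencies change occasionally; copying only the requirements file first
# means this layer is rebuilt (and re-pushed) only when requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code changes constantly, but this final layer is only a few MB.
COPY . .

CMD ["python", "main.py"]
```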
So Docker has fallen short on some promises and met others. But the truth is I can't think of a better way to build some fussy C project (the first thing I look for is a Dockerfile, and often it's easier to make one than pollute my system with dependencies). I also can't think of a better way to distribute a Python app. And being able to do "docker run debian:bullseye" is pretty slick. Golang is another story - I think it's very reasonable to say "what's the point?" with a go binary.