# Home
From one data scientist to another: how to use docker to make your life easier.
Any data scientist who has ever had to set up an environment for a deep learning framework knows that getting the combination of CUDA, cuDNN, deep learning framework and other libraries right is a frustrating exercise. With docker you still have to go through the pain of figuring out the right combination, but... And it is a BIG but! Once you have this blueprint, called a docker image, you can use it on other machines as well; you will be up and running in seconds.
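To give you a taste of the end result (a minimal sketch, assuming the docker engine and the NVIDIA container toolkit are already installed; the official pytorch/pytorch image from Docker Hub is used purely as an example):

```bash
# pull a prebuilt image that already contains CUDA, cuDNN and the framework
docker pull pytorch/pytorch

# start an interactive container with GPU access and check that the GPU is visible
docker run --rm -it --gpus all pytorch/pytorch \
  python -c "import torch; print(torch.cuda.is_available())"
```

No environment wrangling on the host: the whole stack lives inside the image.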
This is by no means meant to be an exhaustive introduction to docker (docker can do a lot more!), but merely to get you started on your journey.
Disclaimer: This introduction was written using a Linux host system. If you are using macOS or WSL under Windows, your mileage may vary...
## Sections
To get started with docker, you should look at the following sections:
- Basics - explains the most common operations that you will need to know
- Dockerfile - explains the structure of the `Dockerfile` file which is used to create docker images (see the sketch after this list)
- Best practices - what to keep in mind when creating and using images
- Repositories - where to find the base images that your own images will use
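As a small preview of what the Basics and Dockerfile sections cover, here is a hypothetical `Dockerfile` for a simple Python project (the train.py script and the choice of libraries are placeholders):

```dockerfile
# build on top of an existing base image
FROM python:3.10

# install the libraries the project needs
RUN pip install --no-cache-dir numpy pandas scikit-learn

# copy the (hypothetical) training script into the image
COPY train.py /opt/project/train.py

# define the default command that runs when the container starts
CMD ["python", "/opt/project/train.py"]
```

Turning this blueprint into an image and running it then boils down to two commands:

```bash
# build an image from the Dockerfile in the current directory and tag it
docker build -t my-project:latest .

# start a container from the freshly built image
docker run --rm my-project:latest
```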
If you encounter problems, have a look here:
Just like with any programming language or complex framework, there are certain things that can make your life easier. Therefore, do not forget to have a look at:
Once you get a handle on things and grow tired of manually building images, you might want to look into automating your builds and perhaps running your own registry/proxy. In that case, have a look at the following sections:
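As a quick taste of the latter: a local registry can be spun up with the official registry image (a minimal sketch; port 5000 is just the customary default, and my-project refers to the hypothetical image built above):

```bash
# start a local registry on port 5000
docker run -d -p 5000:5000 --restart always --name registry registry:2

# tag an existing image for the local registry and push it there
docker tag my-project:latest localhost:5000/my-project:latest
docker push localhost:5000/my-project:latest
```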
Finally, if you need to orchestrate multiple docker images, you can have a look at:
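The typical entry point here is docker compose, which wires several containers together via a single YAML file. The sketch below is hypothetical (service names, images and ports are placeholders):

```yaml
# hypothetical docker-compose.yml running a model-serving container next to a database
services:
  model:
    image: my-project:latest   # the hypothetical image built earlier
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:
    image: postgres:15         # official PostgreSQL image
    environment:
      POSTGRES_PASSWORD: example
```

With this file in place, `docker compose up -d` starts both containers with a single command.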
## About the content
This page was generated using mkdocs. The source code itself is hosted at github.com/waikato-datamining/docker-for-data-scientists and licensed under CC-BY-SA 4.0.