2.2 Docker tutorial for beginners
To build, test, and deploy environments efficiently, AI Labs uses Docker to implement federated learning. This section is a guide for engineers who are new to Docker. You can skip it if you are already familiar with Docker.
What and why Docker
Docker is one of the most widely used tools in software engineering. The concept is analogous to a virtual machine: it lets an environment be replicated easily. The overall picture of Docker consists of:
- Docker repositories:
  - A repository is a cloud library that stores a wide range of Docker images.
  - You can download (pull) Docker images from it and upload (push) Docker images to it.
  - It can be either public or private. The most well-known public repository is DockerHub. However, AI Labs uses Harbor as a private repository for privacy and security reasons.
- Docker images:
  - An image is a template of an operating system environment, analogous to an *.iso file.
  - Images are read-only and lifeless.
- Docker containers:
  - An image can be activated into one or more containers, which are writable, alive, and mutually independent.
  - If a container crashes, completes its task, or is stopped manually, it changes from the running state to a dead (exited) state.
  - Changes made inside a container live in that container's writable layer, so they are lost once the container is removed. However, one can commit a container into an image to save the modifications permanently.
Although the concept of Docker is very close to that of virtual machines, Docker has several advantages over them:
- Shared layer structure:
  Docker images share common layers, so pulling or committing a new image is much faster after the first time, since only the layers that differ need to be added.
- Console based:
  Containers provide a CLI only instead of a GUI, so they use much less memory and run more efficiently.
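The shared layer structure can be inspected directly once Docker is installed (see the next subsection). The following is a minimal sketch, assuming the `ubuntu:latest` image has already been pulled; the function name `show_layers` is hypothetical, and the commands are wrapped in a function so that nothing runs until it is called.

```shell
# Sketch: look at the layers that make up an image.
# Assumption: the image passed as $1 (e.g. ubuntu:latest) exists locally.
show_layers() {
    # Build history of the image, with the size each step added.
    docker history "$1"
    # Digests of the filesystem layers, printed as a JSON list.
    docker image inspect --format '{{json .RootFS.Layers}}' "$1"
}
```

Calling `show_layers ubuntu:latest` and then the same function on an image committed from an Ubuntu container should show that the two images have their first layer digests in common.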
How to use docker
To begin with, install docker by following the official tutorial according to your operating system.
The most commonly used commands are listed below (`<image>` can be an image ID or image name; `<container>` can be a container ID or container name):

| Commands | Explanations |
| --- | --- |
| `docker images` | List all images |
| `docker ps [-a]` | List running containers (`-a` also lists stopped ones) |
| `docker search <image>` | Search for images in the repository |
| `docker pull <image>` | Pull the image from the repository |
| `docker rmi <image>` | Remove the image from the local computer |
| `docker run -id --name <container> <image> sleep inf` | Run a container from the image |
| `docker stop <container>` | Stop the running container |
| `docker rm <container>` | Remove the dead container |
| `docker exec -it <container> /bin/bash` | Log in to the container |
| `docker commit <container> <image>` | Save a container as an image |
| `docker cp <container>:<path> <hostPath>` | Copy a file from the container to the host |
| `docker cp <hostPath> <container>:<path>` | Copy a file from the host to the container |
| `docker save -o <hostPath.tar> <image>` | Export the image as a tar archive |
| `docker load --input <hostPath.tar>` | Import the tar archive as an image |
The following example helps beginners become familiar with Docker quickly.

    docker images                      # List all images on the local machine. It will be empty initially.
    docker pull ubuntu:latest          # Pull the image ubuntu:latest from DockerHub.
    docker images                      # List all images again and you will see the pulled one.

    docker ps                          # List running containers. It will be empty initially.
    docker run -id --name foo ubuntu:latest sleep inf   # Run a container named foo from the image.
    docker ps                          # List containers again and you will see the one you ran.

    docker exec -it foo /bin/bash      # Log in to the container
    ls                                 # List files
    touch a.txt                        # Create an empty text file
    ls                                 # List files again and you can see a.txt
    exit                               # Log out of the container

    docker commit foo ubuntu:latest_v1 # Save the container as a new image
    docker stop foo                    # Stop the container
    docker ps -a                       # List all containers, including stopped ones
    docker rm foo                      # Remove the container
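The walkthrough above does not exercise the copy, save, and load commands from the table. A minimal sketch of those follows, assuming the container `foo` is still running (that is, before the `docker stop` and `docker rm` steps) and that the image `ubuntu:latest_v1` has been committed. The file names `notes.txt`, `notes_copy.txt`, and `ubuntu_v1.tar` and the function name are hypothetical.

```shell
# Sketch: move files and images between the host and Docker.
# Assumptions: container foo is running; image ubuntu:latest_v1 exists.
copy_and_export_demo() {
    echo "hello" > notes.txt
    # Host -> container
    docker cp notes.txt foo:/tmp/notes.txt
    # Container -> host
    docker cp foo:/tmp/notes.txt ./notes_copy.txt
    # Image -> tar archive on the host
    docker save -o ubuntu_v1.tar ubuntu:latest_v1
    # Tar archive -> image (for example on another machine)
    docker load --input ubuntu_v1.tar
}
```

The tar archive produced by `docker save` can be copied to a machine without access to the repository and restored there with `docker load`.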
Some useful options when running a container:
- `--gpus=all` makes all host GPUs available to the container.
- `-p <hostPort>:<containerPort>` maps a host port to a container port.
- `--rm` removes the container automatically once it stops.
- `-v <hostPath>:<containerPath>` mounts a host path into the container as a volume.
- `-w <path>` sets the default working directory.
- `sleep inf` is one of the most commonly used commands when running a container. It keeps the container from terminating immediately.
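Several of these options are usually combined in a single `docker run` call. The sketch below is one possible combination, not a required setup: the port numbers 8080 and 80, the container path `/workspace`, and the function name `run_dev_shell` are hypothetical values chosen for illustration. `--gpus=all` is left out because it only works on hosts with GPU support configured.

```shell
# Sketch: start a throwaway interactive container with a mounted working directory.
# Assumptions: ubuntu:latest exists locally; ports and paths are illustrative.
run_dev_shell() {
    # --rm         remove the container when the shell exits
    # -it          attach an interactive terminal
    # -p 8080:80   map host port 8080 to container port 80
    # -v / -w      mount the current host directory and start inside it
    docker run --rm -it \
        -p 8080:80 \
        -v "$PWD":/workspace \
        -w /workspace \
        ubuntu:latest /bin/bash
}
```

Because a shell is started directly, `sleep inf` is not needed here; the container lives as long as the shell does.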
Dockerfile
At this point, we can customize an image manually by modifying a container and committing it. However, rebuilding an image this way is tedious because re-typing the commands takes a lot of time. In addition, using an image whose modifications are not explicitly recorded puts the user's computer at security risk. Hence, building Docker images from scripts is safer and more reassuring.
Create a text file named `Dockerfile`. By default, `docker build` looks for a file with exactly this name.
    # Use this image as the base
    FROM ubuntu:latest
    # Execute commands in the image being built
    RUN apt-get update && apt-get install -y python3-pip
    # Copy a file from the host into the image
    COPY ./foo.txt /app/foo.txt
    # Set the working directory used after login
    WORKDIR /home
After editing the Dockerfile, go to the directory that contains it and type `docker build -t <imageName> .` to build an image. You can then run the image and log in to verify that the container works properly.
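The build-and-verify step could look like the following sketch. It assumes the Dockerfile above and a `foo.txt` file are in the current directory; the image name `my-image` and the function name `build_and_check` are hypothetical.

```shell
# Sketch: build an image from ./Dockerfile and check its contents.
# Assumption: my-image is a hypothetical image name.
build_and_check() {
    # Build from the Dockerfile in the current directory.
    docker build -t my-image .
    # Confirm that the image now exists locally.
    docker images my-image
    # Confirm that pip was installed by the RUN step.
    docker run --rm my-image python3 -m pip --version
    # Confirm that foo.txt was copied by the COPY step.
    docker run --rm my-image ls /app
}
```

Using `--rm` for these checks avoids leaving stopped containers behind after each verification command.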
More about building a docker image
- Most commands for installing Python packages are the same on desktops and in containers. However, the following exception must be considered.

  | Desktops | Containers |
  | --- | --- |
  | pip install opencv-python | pip install opencv-python-headless |

- The following warnings, printed in red, can be ignored.
  - debconf: delaying package configuration, since apt-utils is not installed
  - Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
  - WARNING: You are using pip version 20.2.4; however, version 23.0.1 is available. You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
- To build an image without the cache: `docker build --no-cache -t <imageName> .`.
- Please make sure your pip version is recent enough to install the packages required by FLaVor, e.g. pip>=20.0.2.
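The pip version requirement can be checked from the host with a short sketch like the one below. It assumes GNU `sort` with the `-V` (version sort) option is available and that `pip --version` prints a line beginning with `pip <version>`; the function names and the image name `my-image` are hypothetical.

```shell
# Sketch: check that the pip inside an image meets a minimum version.
# Assumption: sort supports -V (version-aware ordering).
version_at_least() {
    # Succeeds when version $1 is greater than or equal to minimum $2:
    # after a version sort, the smaller of the two must be the minimum.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

pip_version_in_image() {
    # The second field of the `pip --version` output is the version number.
    docker run --rm "$1" python3 -m pip --version | awk '{print $2}'
}
```

For example, `version_at_least "$(pip_version_in_image my-image)" 20.0.2 || echo "pip is too old"` prints a message only when the image needs a newer pip.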
More about pushing a docker image
After a PI packs their code into a Docker image, the PI needs to push it to a cloud repository for AI Labs federated learning. The command looks like `docker push <cloud-repo>`. Here are some FAQs.
- error creating overlay mount to /var/lib/docker/overlay2/grpc6nztkg753nod4obw019to/merged: invalid argument
  Please restart your Docker engine, or update it to the latest version.
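The push step described in this section typically involves logging in, tagging the image with the repository address, and pushing. The sketch below shows that sequence under stated assumptions: the registry host `harbor.example.com`, the project `my-project`, the image `my-image:v1`, and the function name are all hypothetical placeholders, since the actual repository address is not given here.

```shell
# Sketch: tag a local image for a private repository and push it.
# Assumptions: harbor.example.com, my-project, and my-image:v1 are placeholders.
push_to_registry() {
    registry="harbor.example.com"
    target="$registry/my-project/my-image:v1"
    # Authenticate against the repository (prompts for credentials).
    docker login "$registry"
    # Give the local image a name that includes the repository address.
    docker tag my-image:v1 "$target"
    # Upload the image.
    docker push "$target"
}
```

Only the layers that the repository does not already hold are uploaded, which is why repeated pushes of similar images tend to be fast.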