2.2 Docker tutorial for beginners

    To build, test, and deploy environments efficiently, AI Labs uses Docker to implement federated learning. This section is a guide for engineers who are new to Docker. You can skip it if you are already familiar with Docker.

    What Docker is and why to use it

    Docker is one of the most widely used tools in software engineering. Its concept is analogous to a virtual machine: it lets an environment be replicated easily. The overall picture of Docker consists of:

    • Docker repositories:
      • A repository is a cloud library that stores a wide range of Docker images.
      • You can download (pull) Docker images from it and upload (push) images to it.
      • It can be either public or private. The best-known public repository is DockerHub. AI Labs, however, uses Harbor as a private repository for privacy and security reasons.
    • Docker images:
      • An image is a template of an operating system environment, analogous to an *.iso file.
      • Images are read-only and inert.
    • Docker containers:
      • An image can be started as one or more containers, which are writable, running, and independent of one another.
      • If a container crashes, completes its task, or is stopped manually, it changes from the running state to a stopped (dead) state.
      • Changes made inside a container are not written back to the image, so they are lost once the container is removed. However, you can commit a container to an image to save the changes permanently.

    Although the concept of Docker is close to that of virtual machines, Docker has several advantages over them:

    • Shared layer structure:
      Docker images share common layers, so pulling or committing a new image is much faster than the first time, because only the layers that differ need to be added.
    • Console based:
      Containers typically provide only a CLI instead of a GUI, so they use much less memory and run more efficiently.

    How to use docker

    To begin, install Docker by following the official installation guide for your operating system.

    The most commonly used commands are listed below. (<image> can be an image ID or an image name; <container> can be a container ID or a container name.)

    Commands                                             Explanations
    docker images                                        List all local images
    docker ps (-a)                                       List running containers (add -a to include stopped ones)
    docker search <image>                                Search the repository for images
    docker pull <image>                                  Pull the image from the repository
    docker rmi <image>                                   Remove the image from the local computer
    docker run -id --name <container> <image> sleep inf  Run a container from the image
    docker stop <container>                              Stop the running container
    docker rm <container>                                Remove the stopped container
    docker exec -it <container> /bin/bash                Log in to the container (open a shell)
    docker commit <container> <image>                    Save a container as an image
    docker cp <container>:<path> <hostPath>              Copy a file from the container to the host
    docker cp <hostPath> <container>:<path>              Copy a file from the host to the container
    docker save -o <hostPath.tar> <image>                Save the image to a tar archive
    docker load --input <hostPath.tar>                   Load an image from a tar archive

    The following example helps beginners get familiar with Docker quickly.

    docker images # List all images on the local machine. It will be empty initially.

    docker pull ubuntu:latest # Pull the image ubuntu:latest from DockerHub.

    docker images # List all images again and you will see the pulled one.

    #

    docker ps # List all containers. It will be empty initially.

    docker run -id --name foo ubuntu:latest sleep inf # Run a container named foo from the image.

    docker ps # List all containers again and you will see the one you ran.

    #

    docker exec -it foo /bin/bash # Log in to the container

    ls # List files

    touch a.txt # Create an empty text file

    ls # List files again and you can see a.txt

    exit # Log out of the container

    #

    docker commit foo ubuntu:latest_v1 # Save the container as a new image

    docker stop foo # Stop the container

    docker ps -a # List all containers, including stopped ones

    docker rm foo # Remove the container

    Some additional options are useful when running a container:

    • --gpus=all: make all host GPUs available to the container.
    • -p <hostPort>:<containerPort>: publish a container port on a host port.
    • --rm: remove the container automatically when it exits.
    • -v <hostPath>:<containerPath>: mount a host path into the container as a volume.
    • -w <path>: set the default working directory.
    • `sleep inf` is one of the most commonly used commands when running a container. It keeps the container from terminating immediately.
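
    The options above can be combined in a single `docker run` call. The sketch below collects them one by one so that each can be commented, then prints the resulting command instead of executing it. The port numbers (8080 and 80) and the `./data` directory are hypothetical values chosen for illustration.

```shell
# Collect the options in the positional parameters, one group per line.
# --gpus=all is left out here because it needs GPU support on the host.
set -- --rm -it                      # remove the container on exit; interactive terminal
set -- "$@" -p 8080:80               # host port 8080 -> container port 80 (hypothetical)
set -- "$@" -v "$PWD/data:/data"     # mount ./data from the host at /data (hypothetical)
set -- "$@" -w /data                 # start in /data inside the container
set -- "$@" ubuntu:latest /bin/bash  # the image, then the command to run in it

# Preview the command. To start the container for real, run: docker run "$@"
echo "docker run $*"
```

    Because `--rm` is set, the container disappears as soon as the shell exits, so anything that should survive has to be written under the mounted `/data` directory.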

    Dockerfile

    At this point, we can customize an image manually. However, rebuilding an image from the original one by hand is tedious, because retyping the commands takes a lot of time. In addition, using an image whose modifications are not explicitly recorded puts the user's computer at risk. Hence, building Docker images from scripts is safer and more reassuring.

    Create a text file named `Dockerfile`. This is the file name that `docker build` looks for by default.

    # Start from a base image
    FROM ubuntu:latest

    # Execute commands while building the image
    RUN apt-get update && apt-get install -y python3-pip

    # Copy a file from the host into the image
    COPY ./foo.txt /app/foo.txt

    # Set the working directory used after login
    WORKDIR /home

    After editing the Dockerfile, go to the directory that contains it and run `docker build -t <imageName> .` to build an image. You can then run the image and log in to verify that the container works properly.
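
    The build step can be tried end to end with the sketch below, which prepares a throwaway build context and builds it only when a Docker daemon is reachable. The image name `my-image:dev`, the temporary directory, and the contents of `foo.txt` are hypothetical. `apt-get update`, the `-y` flag, and `DEBIAN_FRONTEND=noninteractive` are additions assumed to be needed so that the package installation does not wait for input during the build.

```shell
# Create a scratch build context (hypothetical layout, for illustration only).
ctx="$(mktemp -d)"
echo "hello" > "$ctx/foo.txt"

# Write the Dockerfile. Comments must sit on their own lines.
cat > "$ctx/Dockerfile" <<'EOF'
# Start from a base image
FROM ubuntu:latest
# Install pip without interactive prompts (assumed addition)
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y python3-pip
# Copy a file from the build context into the image
COPY ./foo.txt /app/foo.txt
WORKDIR /home
EOF

# Build only when a Docker daemon is reachable; otherwise report where the context is.
if docker info >/dev/null 2>&1; then
    docker build -t my-image:dev "$ctx" || echo "docker build failed" >&2
else
    echo "Docker is not available; build context prepared in $ctx"
fi
```

    After a successful build, `docker run --rm my-image:dev cat /app/foo.txt` is one way to check that the copied file is present in the image.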

    More about building a docker image

    • Most commands for installing Python packages are the same on desktops and in containers. However, note the following difference:
      Desktops                     Containers
      pip install opencv-python    pip install opencv-python-headless
       
    • The following warnings, printed in red, can be ignored:
      • debconf: delaying package configuration, since apt-utils is not installed

      • Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

      • WARNING: You are using pip version 20.2.4; however, version 23.0.1 is available.

        You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.

    • To build an image without using the cache, run `docker build --no-cache -t <imageName> .`.
    • Make sure your pip version is recent enough to install the packages required by FLaVor, e.g. pip>=20.0.2.
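
    The pip version requirement in the last item can be checked with a small script. This is a minimal sketch that assumes `sort -V` (version sort, available in GNU coreutils) and takes the minimum version 20.0.2 from the text above. The helper name `version_ge` is made up for this example.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# The older of the two versions sorts first, so A >= B exactly when B comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="20.0.2"

if command -v pip3 >/dev/null 2>&1; then
    # `pip3 --version` prints "pip <version> from <path> (python <x.y>)".
    current="$(pip3 --version | awk '{print $2}')"
    if version_ge "$current" "$required"; then
        echo "pip $current satisfies pip>=$required"
    else
        echo "pip $current is older than $required; upgrade with: python3 -m pip install --upgrade pip"
    fi
else
    echo "pip3 not found"
fi
```

    The same check can be run inside a container through `docker exec` to confirm the version that the image actually ships.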

    More about pushing a docker image

    After a PI packs their code into a Docker image, the PI needs to push it to a cloud repository for AI Labs federated learning. The command looks like `docker push <cloud-repo>`. Here is a frequently encountered error and its fix.

    • error creating overlay mount to /var/lib/docker/overlay2/grpc6nztkg753nod4obw019to/merged: invalid argument
      Please restart your docker engine or update the docker engine to the latest version.
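
    For the push step itself, an image usually has to be tagged with the registry address before `docker push` will send it there. The sketch below shows one way to do that, assuming a Harbor-style reference of the form `<registry>/<project>/<name>:<tag>`. The registry host, project, and image name are hypothetical placeholders, and the `run` helper only prints the commands unless `DRY_RUN=0` is set.

```shell
# Hypothetical values: replace them with the registry, project, and image you were given.
registry="harbor.example.com"
project="my-project"
local_image="my-image:dev"

# A pushable reference has the form <registry>/<project>/<name>:<tag>.
remote_image="${registry}/${project}/${local_image}"

# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run docker login "$registry"                    # authenticate against the private registry
run docker tag "$local_image" "$remote_image"   # give the local image its remote name
run docker push "$remote_image"                 # upload the image
```

    Printing the commands first makes it easy to confirm the target reference before anything is uploaded.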

    Taiwan AI Labs (AILabs.tw) Copyright © 2023