Docker Task in HPC Pack
An HPC Pack docker task is a task that runs inside a docker container.
To use this feature, set the task environment variable CCP_DOCKER_IMAGE to the docker image that should be used to start the container for the task. The format is: CCP_DOCKER_IMAGE=[Docker Registry]<Repository>[:Tag]
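As an illustration of the format above, the following Python sketch composes a CCP_DOCKER_IMAGE value from its parts. The function name docker_image_ref and its parameters are hypothetical and not part of HPC Pack. The sketch assumes the registry is joined to the repository with a single slash, as in docker.io/library/ubuntu:16.04.

```python
def docker_image_ref(repository, registry=None, tag=None):
    """Compose [registry/]repository[:tag] (hypothetical helper, not an HPC Pack API)."""
    if not repository:
        raise ValueError("repository is required")
    ref = repository
    if registry:
        # Assumption: registry and repository are separated by exactly one slash.
        ref = registry.rstrip("/") + "/" + ref
    if tag:
        ref = ref + ":" + tag
    return ref


def docker_image_env(repository, registry=None, tag=None):
    """Return the NAME=VALUE string for the task environment variable."""
    return "CCP_DOCKER_IMAGE=" + docker_image_ref(repository, registry, tag)


if __name__ == "__main__":
    print(docker_image_env("library/ubuntu", registry="docker.io", tag="16.04"))
    print(docker_image_env("ubuntu"))
```

Only the repository is mandatory; the registry and tag are left out of the value when they are not given.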
Several other environment variables can be used to adjust how the container is started:
CCP_DOCKER_NVIDIA (Linux only): start the docker container with the nvidia-docker command instead of docker. For example: CCP_DOCKER_NVIDIA=1
CCP_DOCKER_VOLUMES: the directories to mount from the host into the docker container as volumes. For example: CCP_DOCKER_VOLUMES=/host_path1:/container_path1,/common_path,/host_path2:/container_path2:z or CCP_DOCKER_VOLUMES=c:\foo:c:\dest,c:\foo:d:
CCP_DOCKER_DEBUG: keep the container alive for debugging after the command in it finishes. The container must then be removed manually. For example: CCP_DOCKER_DEBUG=1
CCP_DOCKER_START_OPTION: additional options to pass when starting the docker container. For example: CCP_DOCKER_START_OPTION=--network=host --ulimit memlock=-1
CCP_DOCKER_SKIP_SSH_SETUP (Linux only): skip the default setup of SSH communication between containers (the container uses the SSH keys and network of the host, and the SSH server is stopped on the host and started in the container). Set this when the docker image handles SSH setup itself. For example: CCP_DOCKER_SKIP_SSH_SETUP=1
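The CCP_DOCKER_VOLUMES examples above suggest a comma-separated list of entries, where each entry is a single path or several parts joined by colons. The Python sketch below builds such a value under that assumption. The helper name format_docker_volumes is hypothetical and not part of HPC Pack.

```python
def format_docker_volumes(mounts):
    """Build a CCP_DOCKER_VOLUMES=... string (hypothetical helper).

    Each mount is either a string (a path used as-is, e.g. "/common_path")
    or a sequence of parts such as (host_path, container_path[, option]).
    """
    entries = []
    for mount in mounts:
        parts = (mount,) if isinstance(mount, str) else tuple(mount)
        # A comma would be read as an entry separator, so reject it here.
        if not parts or any(not p or "," in p for p in parts):
            raise ValueError(f"invalid mount: {mount!r}")
        entries.append(":".join(parts))
    return "CCP_DOCKER_VOLUMES=" + ",".join(entries)


if __name__ == "__main__":
    print(format_docker_volumes([
        ("/host_path1", "/container_path1"),
        "/common_path",
        ("/host_path2", "/container_path2", "z"),
    ]))
```

With the three mounts shown, the sketch reproduces the first example value from the list above.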
To run docker tasks, docker must be installed on the Windows or Linux compute nodes as a prerequisite.
When a docker task is allocated multiple Linux nodes to run an MPI application, no other MPI docker task should use these nodes at the same time, because the container on each Linux node shares the network with its host. Running MPI applications in docker tasks on Windows compute nodes is not supported yet.
The Linux OS in the docker image must provide /bin/bash.
To run MPI applications in docker containers on Linux nodes, the docker image must have sudo, an SSH service, and MPI installed.
Run docker task on Linux compute nodes step by step
Deploy cluster with ARM template
Use the "Single head node cluster for Linux workloads" template to deploy the cluster
Install docker on Linux compute nodes
Install Docker CE following docker docs with clusrun
clusrun /nodegroup:linuxnodes /interleaved yum -y update
clusrun /nodegroup:linuxnodes /interleaved yum install -y yum-utils device-mapper-persistent-data lvm2
clusrun /nodegroup:linuxnodes /interleaved yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
clusrun /nodegroup:linuxnodes /interleaved yum install -y docker-ce docker-ce-cli containerd.io
clusrun /nodegroup:linuxnodes systemctl start docker
In this doc, the version of Docker CE is 17.11.0-ce-rc3.
Run command in container as docker task
Submit a job containing 1 docker task
job submit /env:ccp_docker_image=docker.io/library/ubuntu:16.04 hostname
Check job result in HPC Pack Cluster Manager:
Submit a job containing multiple docker tasks
job new
job add !! /env:ccp_docker_image=ubuntu cat /etc/*release ^| grep ^^NAME
job add !! /env:ccp_docker_image=centos cat /etc/*release ^| grep ^^NAME
job add !! /env:ccp_docker_image=debian cat /etc/*release ^| grep ^^NAME
job add !! /env:ccp_docker_image=fedora cat /etc/*release ^| grep ^^NAME
job submit /id:!!
Check job result in HPC Pack Cluster Manager:
Tasks inherit the environment variables of their job unless they define variables with the same names, so the docker image can also be specified in the job environment variables:
job new /jobenv:ccp_docker_image=ubuntu
job add !! hostname^; cat /etc/*release ^| grep ^^NAME
job add !! hostname^; cat /etc/*release ^| grep ^^NAME
job add !! hostname^; cat /etc/*release ^| grep ^^NAME
job add !! /env:ccp_docker_image=centos hostname^; cat /etc/*release ^| grep ^^NAME
job submit /id:!!
Check job result in HPC Pack Cluster Manager:
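The inheritance rule used in the last job can be modeled as a dictionary merge in which task-level variables take precedence over job-level ones. The Python sketch below is only a model of that rule, not HPC Pack code. The function name effective_env is hypothetical, and treating variable names as case-insensitive is an assumption based on the lowercase ccp_docker_image used in the commands.

```python
def effective_env(job_env, task_env):
    """Model which environment a task ends up with (hypothetical helper).

    Assumption: variable names are case-insensitive, so they are
    normalized to upper case before merging.
    """
    merged = {name.upper(): value for name, value in job_env.items()}
    # Task-level variables override job-level variables of the same name.
    merged.update({name.upper(): value for name, value in task_env.items()})
    return merged


if __name__ == "__main__":
    job_env = {"ccp_docker_image": "ubuntu"}
    # Three tasks without their own image, one task that overrides it.
    task_envs = [{}, {}, {}, {"ccp_docker_image": "centos"}]
    for task_env in task_envs:
        print(effective_env(job_env, task_env)["CCP_DOCKER_IMAGE"])
```

In this model the first three tasks resolve to the ubuntu image from the job, and the fourth resolves to centos.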
Run MPI docker task
Build customized docker image with MPICH installed
Perform this step on any Linux node with docker installed.
Start a container from the ubuntu docker image:
docker run -it ubuntu
Install sudo, ssh, vim and mpich with apt:
apt update; apt -y install sudo ssh vim mpich
Write a simple MPI program:
mkdir /mpisample
chmod o+w /mpisample
cd /mpisample
vim helloMpi.c
Edit helloMpi.c with the following content:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv)
{
    int rank, size, processor_name_length;
    char processor_name[1000];

    MPI_Init(NULL, NULL);
    /* Rank of this process and total number of processes. */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(processor_name, &processor_name_length);
    printf("Hello from %s, rank %d out of %d processors.\n", processor_name, rank, size);
    MPI_Finalize();
    return 0;
}
Compile helloMpi.c with mpicc and create a shell script run.sh:
mpicc helloMpi.c -o helloMpi
vim run.sh
Edit run.sh with the following content:
#!/bin/bash
echo $CCP_NODES | tr " " "\n" | sed "1d;n;d" | cat > host_file
num=$1
[ -z "$num" ] && num=4
mpirun -n $num -f host_file ./helloMpi
Set the execution permission of run.sh and exit the docker container:
chmod +x run.sh
exit
Commit and push the docker image to docker hub:
docker commit $(docker ps -qa -n 1) <docker hub account>/mpich
docker login -u <docker hub account> -p <password>
docker push <docker hub account>/mpich
A docker hub account is needed to perform this operation.
This procedure builds the docker image manually; an alternative is to use a Dockerfile.
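The host file step in run.sh is compact, so the following Python sketch spells out what the tr and sed pipeline appears to do: drop the first token of CCP_NODES and then keep every second token. This assumes CCP_NODES holds a node count followed by alternating node names and core counts, which is what the pipeline implies. The function names and the sample value are hypothetical.

```python
def hosts_from_ccp_nodes(ccp_nodes):
    """Mirror `tr " " "\\n" | sed "1d;n;d"` on a CCP_NODES string.

    Assumed layout: "<count> <node1> <cores1> <node2> <cores2> ...".
    """
    tokens = ccp_nodes.split()
    # Skip the leading count, then take every second token (the node names).
    return tokens[1::2]


def host_file_text(ccp_nodes):
    """Return the text that run.sh would write to host_file."""
    return "".join(host + "\n" for host in hosts_from_ccp_nodes(ccp_nodes))


if __name__ == "__main__":
    sample = "2 node1 4 node2 4"  # hypothetical value for illustration
    print(hosts_from_ccp_nodes(sample))
    print(host_file_text(sample), end="")
```

The resulting host file lists one node name per line, which is the form passed to mpirun with the -f option in run.sh.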
Run MPI task
Submit a job with a docker task that runs the MPI application in the docker image built in the previous step:
job submit /env:ccp_docker_image=<docker hub account>/mpich /numnodes:4 /workdir:/mpisample ./run.sh 16
Run MPI docker task with docker image as SSH server
Create docker image as SSH server
Create a Dockerfile that builds an Ubuntu-based docker image containing SSH keys for the root user. The container starts as an SSH server on port 3022:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y sudo ssh mpich
RUN ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
RUN cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
RUN echo "Port 3022" >> /root/.ssh/config
RUN mkdir /run/sshd
ENTRYPOINT ["/usr/sbin/sshd", "-D", "-p", "3022"]
Build and push the docker image to docker hub:
docker build -t <docker hub account>/ubuntu_mpich_as_ssh_server .
docker login -u <docker hub account> -p <password>
docker push <docker hub account>/ubuntu_mpich_as_ssh_server
Run MPI tasks
Submit a job with a docker task that runs an MPI application in the docker image built in the previous step, using the command
mpirun -machinefile $CCP_MPI_HOSTFILE hostname
and the environment variables
CCP_DOCKER_IMAGE=<docker hub account>/ubuntu_mpich_as_ssh_server
CCP_MPI_HOSTFILE_FORMAT=1
CCP_DOCKER_SKIP_SSH_SETUP=1
CCP_DOCKER_START_OPTION=--network=host
Check task output
Run docker task on Windows compute node step by step
Add Windows compute node into cluster
Add a Windows compute node, created from an image with containers installed, by using Burst to Azure IaaS VM
Run command in container as docker task
Submit a job containing a docker task allocated to Windows compute node
job submit /requestednodes:IaaSWinCN000 /env:CCP_DOCKER_IMAGE=mcr.microsoft.com/windows/servercore:ltsc2016 ping -t localhost
Peek task output
Cancel the job
job cancel !!