YOLO on Azure Deep Learning Virtual Machine (DLVM – Linux)
In a previous post I covered setting up YOLO on an Azure DLVM. In this post I'll cover setting up YOLO on a Linux (Ubuntu) DSVM/DLVM using Docker.
First, spin up a new Deep Learning Virtual Machine on Linux. It comes already set up with the Nvidia GPU CUDA drivers and Docker.
Then ssh in and clone my Docker GitHub repo (https://github.com/daltskin/dlvm-darknet). It uses a fork of https://github.com/AlexeyAB/darknet, includes an OpenCV dependency fix, and enables GPU and CUDNN in the Makefile. Build both images and run the container:
$ git clone https://github.com/daltskin/DLVM-Darknet
$ sudo docker build DLVM-Darknet/darknet -t darknet:latest
$ sudo docker build DLVM-Darknet -t dlvm-darknet:latest
$ sudo docker run --runtime=nvidia dlvm-darknet:latest
You should see some output that looks like:
Loading weights from yolov3.weights...Total BFLOPS 65.864Done!
seen 64
./data/horses.jpg: Predicted in 0.101588 seconds.
horse: 89%
horse: 98%
horse: 97%
horse: 91%
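If you want to use the detections in a script rather than read them by eye, the output can be filtered down to just the label and confidence. The following is a minimal sketch, not part of the repo: the function name list_detections is made up for illustration, and it assumes detection lines always look like "horse: 89%" as in the sample above.

```shell
# Hypothetical helper (not part of DLVM-Darknet): reads darknet output on stdin
# and prints one "label confidence" pair per detection.
# Assumption: detection lines have the form "<label>: <NN>%", as in the sample output.
list_detections() {
  awk -F': ' '/^[A-Za-z][A-Za-z ]*: [0-9]+%$/ {
    sub(/%$/, "", $2)   # drop the trailing percent sign
    print $1, $2
  }'
}
```

To use it, pipe the container output through the function:
$ sudo docker run --runtime=nvidia dlvm-darknet:latest 2>&1 | list_detections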
Troubleshooting
Using Docker should eliminate many of the dependency issues involved in setting up YOLO. However, the Nvidia drivers on the host VM should still be checked by running the following command:
$ nvidia-smi
You should see some output like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000AC17:00:00.0 Off |                    0 |
| N/A   43C    P0    56W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
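For a quicker check, nvidia-smi can also print just the fields you care about on a single line. The sketch below wraps that in a small function that first checks whether nvidia-smi is installed at all. The function name gpu_summary is made up for illustration, and the choice of fields (name, driver version, total memory) is only an example.

```shell
# Hypothetical helper: prints GPU name, driver version and total memory as CSV,
# or an error if nvidia-smi is not on the PATH.
gpu_summary() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: the Nvidia driver may not be installed" >&2
    return 1
  fi
  # --query-gpu and --format are standard nvidia-smi options.
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}
```

If this reports that nvidia-smi is missing, or nvidia-smi itself fails, the problem is on the host rather than in the container.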
It is recommended that you update the CUDA drivers on the host VM (see /en-us/azure/virtual-machines/linux/n-series-driver-setup?toc=/azure/virtual-machines/linux/toc.json#cuda-driver-updates):
$ sudo apt-get update
$ sudo apt-get upgrade -y
$ sudo apt-get dist-upgrade -y
$ sudo apt-get install cuda-drivers
$ sudo reboot
You may need to do this if you don't see any predictions when running the darknet commands, e.g. if the output stops after lines like these and lists no detections:
seen 64
./data/horses.jpg: Predicted in 0.0101588 seconds.
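This symptom is easy to miss because the container still runs and exits normally. The sketch below turns it into an explicit check. The function name has_detections is made up for illustration, and it assumes that a healthy run prints at least one line of the form "horse: 89%".

```shell
# Hypothetical helper: succeeds (exit 0) if the darknet output on stdin contains
# at least one detection line such as "horse: 89%", and fails otherwise.
# Assumption: the detection line format matches the sample output in this post.
has_detections() {
  grep -E '^[A-Za-z][A-Za-z ]*: [0-9]+%$' >/dev/null
}
```

For example, to print a hint when the predictions are missing:
$ sudo docker run --runtime=nvidia dlvm-darknet:latest 2>&1 | has_detections || echo "No predictions: check the host CUDA drivers"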