[AI-Empowered Workflow] A Programmer's Confession: How I Use Large Language Models to Boost My Productivity by 50% — Translated ChatGPT Conversations, No. 005: Pass Nvidia GPUs Docker.
Preface
Pass Nvidia GPUs Docker.
July 8, 2024
https://chatgpt.com/share/6537ce71-c89a-4d9b-ab3d-fc3582c1c559
Pass Nvidia GPUs Docker.
How do I pass nvidia gpus through to a docker instance?
ChatGPT said:
Passing NVIDIA GPUs through to a Docker instance requires a few steps, including installing the NVIDIA Container Toolkit. Here’s a step-by-step guide to help you set this up:
Prerequisites
- NVIDIA Driver: Ensure you have the NVIDIA driver installed on your host machine.
- Docker: Ensure Docker is installed on your host machine.
- NVIDIA Container Toolkit: This toolkit is necessary for Docker to recognize and use NVIDIA GPUs.
Step 1: Install NVIDIA Driver
Ensure that the appropriate NVIDIA driver for your GPU is installed on your host machine. You can usually install it from the NVIDIA website or through your Linux distribution’s package manager.
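On Ubuntu hosts, for instance, one common route is to let the distribution pick a matching driver; a minimal sketch (the exact driver package depends on your GPU and Ubuntu release, and the ubuntu-drivers tool is assumed to be available):
bash
# Install the recommended NVIDIA driver on Ubuntu
sudo ubuntu-drivers autoinstall
# After a reboot, nvidia-smi on the host should list your GPUs
nvidia-smi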
Step 2: Install Docker
If you don’t already have Docker installed, you can install it by following the instructions on the Docker website.
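For example, on a fresh Ubuntu host you could use Docker's convenience script (a quick sketch; the apt repository method described in the Docker docs is the more controlled alternative):
bash
# Install Docker using the official convenience script
curl -fsSL https://get.docker.com | sudo sh
# Confirm the daemon works
sudo docker run --rm hello-world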
Step 3: Install NVIDIA Container Toolkit
Add the package repositories and the GPG key:
bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Update and install the NVIDIA Docker package:
bash
sudo apt-get update
sudo apt-get install -y nvidia-docker2
Restart Docker to load the new configuration:
bash
sudo systemctl restart docker
Step 4: Verify Installation
Run the following command to verify that Docker can see the NVIDIA GPUs:
bash
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
This command should display the nvidia-smi output from inside the container, showing the available GPUs.
Step 5: Run a Docker Container with GPU Access
You can now run Docker containers with access to NVIDIA GPUs. Use the --gpus option to specify GPU usage. For example:
bash
docker run --gpus all -it --rm nvidia/cuda:11.0-base /bin/bash
This command runs an interactive session in a Docker container with CUDA 11.0, providing access to all GPUs on the host.
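If you only want the container to see some of the host's GPUs rather than all of them, the --gpus flag also accepts a count, for example:
bash
# Request any two GPUs instead of all of them
docker run --gpus 2 -it --rm nvidia/cuda:11.0-base /bin/bash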
Additional Tips
Limiting GPU Access: If you need to limit which GPUs a container can use, you can specify the devices explicitly:
bash
docker run --gpus '"device=0"' -it --rm nvidia/cuda:11.0-base /bin/bash
This command gives the container access to only the first GPU (device=0).
Using Docker Compose: If you’re using Docker Compose, you can specify GPU access in your docker-compose.yml file as follows:
yaml
version: '3.7'
services:
  my_service:
    image: nvidia/cuda:11.0-base
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
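With that file in place, you would typically start the service with the Compose plugin (older installations use the standalone docker-compose command instead):
bash
# Start the GPU-enabled service in the background
docker compose up -d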
By following these steps, you should be able to pass NVIDIA GPUs through to your Docker containers, enabling you to run GPU-accelerated applications within Docker.
You said:
I want to use ubuntu:latest
ChatGPT said:
To use NVIDIA GPUs with a Docker container based on the ubuntu:latest image, follow these steps:
Prerequisites
Ensure the NVIDIA driver, Docker, and NVIDIA Container Toolkit are installed on your host machine as outlined in the previous steps.
Step-by-Step Guide
Install NVIDIA Driver: Ensure that the appropriate NVIDIA driver for your GPU is installed on your host machine.
Install Docker: If Docker is not already installed, follow the installation instructions on the Docker website.
Install NVIDIA Container Toolkit:
bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
Verify Installation:
bash
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Running Ubuntu with GPU Support
Pull the latest Ubuntu image:
bash
docker pull ubuntu:latest
Run the Ubuntu container with GPU access:
bash
docker run --gpus all -it --rm ubuntu:latest /bin/bash
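Once inside the container, nvidia-smi should already be available even though the base image is plain Ubuntu, because the NVIDIA Container Toolkit mounts the host's driver utilities and libraries into the container:
bash
# Inside the container: confirm the GPUs passed through from the host are visible
nvidia-smi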
Setting Up CUDA in the Ubuntu Container
If you need CUDA within the Ubuntu container, you will need to install it manually inside the container. Here’s how:
Run the Ubuntu container with GPU support:
bash
docker run --gpus all -it ubuntu:latest /bin/bash
Inside the container, update the package list and install necessary packages:
bash
apt-get update
apt-get install -y wget build-essential
Download and install the CUDA toolkit:
Visit the CUDA Toolkit Archive and copy the download link for the version you need.
Download the CUDA toolkit. Replace <cuda-version> with the appropriate version number.
bash
wget https://developer.download.nvidia.com/compute/cuda/<cuda-version>/local_installers/cuda_<cuda-version>_linux.run
Install the CUDA toolkit:
bash
sh cuda_<cuda-version>_linux.run
Follow the prompts to complete the installation. You might need to deselect the driver installation if the driver is already installed on the host machine.
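If you prefer a non-interactive install, the runfile accepts flags to install only the toolkit and skip the driver, which is what the Dockerfile below relies on:
bash
# Toolkit-only, unattended install (driver stays on the host)
sh cuda_<cuda-version>_linux.run --silent --toolkit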
Verify CUDA installation:
bash
nvidia-smi
nvcc --version
Dockerfile for Reproducibility
For reproducibility, you can create a Dockerfile to automate the setup:
Dockerfile
# Use the official Ubuntu image as a base
FROM ubuntu:latest

# Install necessary packages
RUN apt-get update && apt-get install -y wget build-essential

# Install CUDA (replace <cuda-version> with the appropriate version)
RUN wget https://developer.download.nvidia.com/compute/cuda/<cuda-version>/local_installers/cuda_<cuda-version>_linux.run && \
    sh cuda_<cuda-version>_linux.run --silent --toolkit && \
    rm cuda_<cuda-version>_linux.run

# Set environment variables for CUDA
ENV PATH=/usr/local/cuda/bin:${PATH}
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH}

# Verify CUDA installation
RUN nvcc --version

# Set the default command to bash
CMD ["bash"]
Build the Docker image:
bash
docker build -t my-ubuntu-cuda .
Run the container with GPU access:
bash
docker run --gpus all -it --rm my-ubuntu-cuda
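As a quick sanity check, you can also run nvidia-smi through the freshly built image (this assumes the host driver and NVIDIA Container Toolkit are set up as above):
bash
# Verify the custom image can see the host GPUs
docker run --gpus all --rm my-ubuntu-cuda nvidia-smi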
By following these steps, you can set up an Ubuntu Docker container with GPU support and, if necessary, CUDA installed.
You said:
Start from step 3 with cuda 11
ChatGPT said:
Starting from step 3, here is a detailed guide to install the NVIDIA Container Toolkit and set up a Docker container with Ubuntu and CUDA 11.
Step 3: Install NVIDIA Container Toolkit
Add the package repositories and the GPG key:
bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Update the package list and install the NVIDIA Docker package:
bash
sudo apt-get update
sudo apt-get install -y nvidia-docker2
Restart Docker to load the new configuration:
bash
sudo systemctl restart docker
Verify Installation:
Run the following command to ensure Docker can see the NVIDIA GPUs:
bash
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
This command should display the nvidia-smi output from inside the container, showing the available GPUs.
Running Ubuntu with GPU Support and CUDA 11
Pull the latest Ubuntu image:
bash
docker pull ubuntu:latest
Run the Ubuntu container with GPU access:
bash
docker run --gpus all -it --rm ubuntu:latest /bin/bash
Installing CUDA 11 in the Ubuntu Container
Inside the running Ubuntu container, follow these steps:
Update the package list and install necessary packages:
bash
apt-get update
apt-get install -y wget build-essential
Download the CUDA 11 toolkit installer:
bash
wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run
Run the CUDA 11 toolkit installer:
bash
sh cuda_11.0.3_450.51.06_linux.run
Follow the prompts to complete the installation. You might need to deselect the driver installation if the driver is already installed on the host machine.
Set environment variables for CUDA:
Add the following lines to your ~/.bashrc file inside the container to set the environment variables:
bash
echo 'export PATH=/usr/local/cuda-11.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
Verify CUDA installation:
bash
nvidia-smi
nvcc --version
Dockerfile for Reproducibility
To automate this process, you can create a Dockerfile:
Dockerfile
# Use the official Ubuntu image as a base
FROM ubuntu:latest

# Install necessary packages
RUN apt-get update && apt-get install -y wget build-essential

# Install CUDA 11 (replace with the appropriate version)
RUN wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run && \
    sh cuda_11.0.3_450.51.06_linux.run --silent --toolkit && \
    rm cuda_11.0.3_450.51.06_linux.run

# Set environment variables for CUDA
ENV PATH=/usr/local/cuda-11.0/bin:${PATH}
ENV LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:${LD_LIBRARY_PATH}

# Verify CUDA installation
RUN nvcc --version

# Set the default command to bash
CMD ["bash"]
Build the Docker image:
bash
docker build -t ubuntu-cuda11 .
Run the container with GPU access:
bash
docker run --gpus all -it --rm ubuntu-cuda11
By following these steps, you should have an Ubuntu Docker container with CUDA 11 installed and GPU support enabled.