Preface

Pass NVIDIA GPUs through to Docker

July 8, 2024
https://chatgpt.com/share/6537ce71-c89a-4d9b-ab3d-fc3582c1c559


How do I pass nvidia gpus through to a docker instance?

ChatGPT said:

Passing NVIDIA GPUs through to a Docker instance requires a few steps, including installing the NVIDIA Container Toolkit. Here’s a step-by-step guide to help you set this up:

Prerequisites

  1. NVIDIA Driver: Ensure you have the NVIDIA driver installed on your host machine.
  2. Docker: Ensure Docker is installed on your host machine.
  3. NVIDIA Container Toolkit: This toolkit is necessary for Docker to recognize and use NVIDIA GPUs.

Step 1: Install NVIDIA Driver

Ensure that the appropriate NVIDIA driver for your GPU is installed on your host machine. You can usually install it from the NVIDIA website or through your Linux distribution’s package manager.

Step 2: Install Docker

If you don’t already have Docker installed, you can install it by following the instructions on the Docker website.

Step 3: Install NVIDIA Container Toolkit

  1. Add the package repositories and the GPG key:

    bash

    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

  2. Update and install the NVIDIA Docker package:

    bash

    sudo apt-get update
    sudo apt-get install -y nvidia-docker2

  3. Restart Docker to load the new configuration:

    bash

    sudo systemctl restart docker
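
Optionally, you can also confirm on the host that the nvidia runtime was registered after the restart. This extra check is not part of the original guide; the exact docker info output varies by Docker version, and the daemon.json path assumes the default location written by the nvidia-docker2 package:

bash

# Look for an "nvidia" entry in the registered runtimes
docker info | grep -i runtimes
# Inspect the runtime configuration installed by nvidia-docker2
cat /etc/docker/daemon.json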

Step 4: Verify Installation

Run the following command to verify that Docker can see the NVIDIA GPUs:

bash

docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

This command should display the nvidia-smi output from inside the container, showing the available GPUs.

Step 5: Run a Docker Container with GPU Access

You can now run Docker containers with access to NVIDIA GPUs. Use the --gpus option to specify GPU usage. For example:

bash

docker run --gpus all -it --rm nvidia/cuda:11.0-base /bin/bash

This command runs an interactive session in a Docker container with CUDA 11.0, providing access to all GPUs on the host.

Additional Tips

  • Limiting GPU Access: If you need to restrict which GPUs a container can use, you can specify the device explicitly (more selection examples follow this list):

    bash

    docker run --gpus '"device=0"' -it --rm nvidia/cuda:11.0-base /bin/bash

    This command gives the container access to only the first GPU (device=0).

  • Using Docker Compose: If you’re using Docker Compose, you can specify GPU access in your docker-compose.yml file as follows:

    yaml

    version: '3.7'
    services:
      my_service:
        image: nvidia/cuda:11.0-base
        runtime: nvidia
        deploy:
          resources:
            reservations:
              devices:
                - capabilities: [gpu]
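
Beyond all or a single device, the --gpus flag also accepts a GPU count or a comma-separated device list. The commands below are illustrative variants using the same CUDA base image as above:

bash

# Request exactly two GPUs (Docker chooses which ones)
docker run --gpus 2 -it --rm nvidia/cuda:11.0-base /bin/bash
# Request GPUs 0 and 1 specifically
docker run --gpus '"device=0,1"' -it --rm nvidia/cuda:11.0-base /bin/bash

With the Docker Compose file above, docker compose up (or docker-compose up on older installations) should start the service with the reserved GPU capabilities on reasonably recent Compose versions.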

By following these steps, you should be able to pass NVIDIA GPUs through to your Docker containers, enabling you to run GPU-accelerated applications within Docker.

You said:

I want to use ubuntu:latest

ChatGPT said:

To use NVIDIA GPUs with a Docker container based on the ubuntu:latest image, follow these steps:

Prerequisites

Ensure the NVIDIA driver, Docker, and NVIDIA Container Toolkit are installed on your host machine as outlined in the previous steps.

Step-by-Step Guide

  1. Install NVIDIA Driver: Ensure that the appropriate NVIDIA driver for your GPU is installed on your host machine.

  2. Install Docker: If Docker is not already installed, follow the installation instructions on the Docker website.

  3. Install NVIDIA Container Toolkit:

    bash

    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update
    sudo apt-get install -y nvidia-docker2
    sudo systemctl restart docker

  4. Verify Installation:

    bash

    docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

Running Ubuntu with GPU Support

  1. Pull the latest Ubuntu image:

    bash

    docker pull ubuntu:latest

  2. Run the Ubuntu container with GPU access:

    bash

    docker run --gpus all -it --rm ubuntu:latest /bin/bash
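
Although ubuntu:latest contains no CUDA packages, the NVIDIA container runtime injects the host's driver utilities into the container when --gpus is used, so a quick sanity check from inside this shell is simply:

bash

# Should list the host GPUs even though no CUDA toolkit is installed in the image
nvidia-smi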

Setting Up CUDA in the Ubuntu Container

If you need CUDA within the Ubuntu container, you will need to install it manually inside the container. Here’s how:

  1. Run the Ubuntu container with GPU support:

    bash

    docker run --gpus all -it ubuntu:latest /bin/bash

  2. Inside the container, update the package list and install necessary packages:

    bash

    apt-get update
    apt-get install -y wget build-essential

  3. Download and install the CUDA toolkit:

    • Visit the CUDA Toolkit Archive and copy the download link for the version you need.

    • Download the CUDA toolkit. Replace <cuda-version> with the appropriate version number.

      bash

      wget https://developer.download.nvidia.com/compute/cuda/<cuda-version>/local_installers/cuda_<cuda-version>_linux.run

    • Install the CUDA toolkit:

      bash

      sh cuda_<cuda-version>_linux.run

      Follow the prompts to complete the installation. You might need to deselect the driver installation if the driver is already installed on the host machine. (An unattended variant is shown after this list.)

  4. Verify CUDA installation:

    bash

    nvidia-smi
    nvcc --version
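
If you prefer to avoid the interactive prompts (convenient inside a container), the runfile installer also supports an unattended mode. This is a sketch using the same <cuda-version> placeholder as above; --silent --toolkit installs only the toolkit and skips the driver, matching the flags used in the Dockerfile below:

bash

# Unattended toolkit-only install; the host driver is left untouched
sh cuda_<cuda-version>_linux.run --silent --toolkit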

Dockerfile for Reproducibility

For reproducibility, you can create a Dockerfile to automate the setup:

Dockerfile

# Use the official Ubuntu image as a base
FROM ubuntu:latest

# Install necessary packages
RUN apt-get update && apt-get install -y wget build-essential

# Install CUDA (replace <cuda-version> with the appropriate version)
RUN wget https://developer.download.nvidia.com/compute/cuda/<cuda-version>/local_installers/cuda_<cuda-version>_linux.run && \
    sh cuda_<cuda-version>_linux.run --silent --toolkit && \
    rm cuda_<cuda-version>_linux.run

# Set environment variables for CUDA
ENV PATH=/usr/local/cuda/bin:${PATH}
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH}

# Verify CUDA installation
RUN nvcc --version

# Set the default command to bash
CMD ["bash"]

Build the Docker image:

bash

docker build -t my-ubuntu-cuda .

Run the container with GPU access:

bash

docker run --gpus all -it --rm my-ubuntu-cuda

By following these steps, you can set up an Ubuntu Docker container with GPU support and, if necessary, CUDA installed.

You said:

Start from step 3 with cuda 11

ChatGPT said:

Starting from step 3, here is a detailed guide to install the NVIDIA Container Toolkit and set up a Docker container with Ubuntu and CUDA 11.

Step 3: Install NVIDIA Container Toolkit

  1. Add the package repositories and the GPG key:

    bash

    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

  2. Update the package list and install the NVIDIA Docker package:

    bash

    sudo apt-get update
    sudo apt-get install -y nvidia-docker2

  3. Restart Docker to load the new configuration:

    bash

    sudo systemctl restart docker

  4. Verify Installation:

    Run the following command to ensure Docker can see the NVIDIA GPUs:

    bash

    docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

    This command should display the nvidia-smi output from inside the container, showing the available GPUs.

Running Ubuntu with GPU Support and CUDA 11

  1. Pull the latest Ubuntu image:

    bash

    docker pull ubuntu:latest

  2. Run the Ubuntu container with GPU access:

    bash

    docker run --gpus all -it --rm ubuntu:latest /bin/bash

Installing CUDA 11 in the Ubuntu Container

Inside the running Ubuntu container, follow these steps:

  1. Update the package list and install necessary packages:

    bash

    apt-get update
    apt-get install -y wget build-essential

  2. Download the CUDA 11 toolkit installer:

    bash

    wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run

  3. Run the CUDA 11 toolkit installer:

    bash

    sh cuda_11.0.3_450.51.06_linux.run

    Follow the prompts to complete the installation. You might need to deselect the driver installation if the driver is already installed on the host machine.

  4. Set environment variables for CUDA:

    Add the following lines to your ~/.bashrc file inside the container to set the environment variables:

    bash

    echo 'export PATH=/usr/local/cuda-11.0/bin:$PATH' >> ~/.bashrc
    echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
    source ~/.bashrc

  5. Verify CUDA installation:

    bash

    nvidia-smi
    nvcc --version
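
Beyond checking version numbers, you can optionally compile and run a trivial CUDA program to confirm that nvcc and the GPU are both usable from inside the container. This sanity check is not part of the original instructions:

bash

cat > check.cu <<'EOF'
#include <cstdio>
int main() {
    // Count the CUDA devices visible to the runtime
    int n = 0;
    cudaGetDeviceCount(&n);
    std::printf("CUDA devices visible: %d\n", n);
    return 0;
}
EOF
nvcc check.cu -o check && ./check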

Dockerfile for Reproducibility

To automate this process, you can create a Dockerfile:

Dockerfile

# Use the official Ubuntu image as a base
FROM ubuntu:latest

# Install necessary packages
RUN apt-get update && apt-get install -y wget build-essential

# Install CUDA 11 (replace with the appropriate version)
RUN wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run && \
    sh cuda_11.0.3_450.51.06_linux.run --silent --toolkit && \
    rm cuda_11.0.3_450.51.06_linux.run

# Set environment variables for CUDA
ENV PATH=/usr/local/cuda-11.0/bin:${PATH}
ENV LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:${LD_LIBRARY_PATH}

# Verify CUDA installation
RUN nvcc --version

# Set the default command to bash
CMD ["bash"]

Build the Docker image:

bash

docker build -t ubuntu-cuda11 .

Run the container with GPU access:

bash

docker run --gpus all -it --rm ubuntu-cuda11
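
As a final check on the image built above (assuming the build succeeded), you can run the verification commands non-interactively against the ubuntu-cuda11 image:

bash

# nvidia-smi comes from the host driver injected by the NVIDIA runtime
docker run --gpus all --rm ubuntu-cuda11 nvidia-smi
# nvcc comes from the CUDA toolkit baked into the image
docker run --gpus all --rm ubuntu-cuda11 nvcc --version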

By following these steps, you should have an Ubuntu Docker container with CUDA 11 installed and GPU support enabled.