GPU Programming - Introduction to CUDA
Event description
Wed, 17 Sep 2025, 1 pm - 5 pm
Thu, 18 Sep 2025, 9 am - 1 pm
This introduction to CUDA programming in C covers the fundamentals of parallel programming on NVIDIA's CUDA platform, including GPU architecture, memory management, kernel functions, and performance optimisation. The materials are designed for beginners and include step-by-step tutorials, practical examples, and exercises to help you start writing and running CUDA programs in C.
Not sure if this workshop is right for you?
See the Common HPC and Accelerator Tools section below for more information.
Prerequisites
Basic experience with C, including heap (dynamically allocated) memory.
Basic experience with the Unix shell and Git.
Basic experience with a text editor such as vim, emacs, or nano.
Learning Outcomes
At the completion of this training session, you will be able to:
Write and compile basic CUDA programs.
Understand the execution model of CUDA-enabled GPUs.
Manage device and host memory.
Optimise code for better performance.
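To give a flavour of what these outcomes involve, here is a minimal vector-addition sketch of the kind the course builds towards. It is not taken from the course materials; it simply shows a kernel, explicit host/device memory management, and a launch, compiled with `nvcc vec_add.cu -o vec_add`.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Kernel: each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host allocations (ordinary heap memory).
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Device allocations and host-to-device copies.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back; cudaMemcpy synchronises with the kernel.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);  /* expect 3.0 */

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```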
Topics Covered
GPU Overview
GPU Execution Model
GPU Workflow
Asynchronous CUDA Calls
Shared Memory
CUDA Events
Unified Memory
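As an illustration of the last topic above: unified memory (available since CUDA 6 via `cudaMallocManaged`) replaces the explicit host/device copies with a single pointer visible to both sides. A minimal sketch, not from the course materials:

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Kernel: scale each element in place.
__global__ void scale(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main(void) {
    const int n = 1024;
    float *x;
    // One managed allocation, accessible from both host and device.
    cudaMallocManaged(&x, n * sizeof(float));
    for (int i = 0; i < n; i++) x[i] = 1.0f;

    scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);
    // Kernel launches are asynchronous: synchronise before the host reads.
    cudaDeviceSynchronize();

    printf("x[0] = %f\n", x[0]);  /* expect 2.0 */
    cudaFree(x);
    return 0;
}
```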
Common HPC and Accelerator Tools
| Tool | Category | Language/API | Parallelism Type | Target Hardware | Typical Use Case |
|---|---|---|---|---|---|
| CuPy | Python library | Python (NumPy-compatible API) | Data-parallel GPU | NVIDIA GPUs | GPU-accelerated array and matrix operations as a drop-in replacement for NumPy |
| CUDA | GPU programming platform | C/C++ API (also Python via PyCUDA, Numba), Fortran | Data-parallel GPU | NVIDIA GPUs | Writing custom GPU kernels and fine-grained GPU code |
| OpenACC | Compiler directives | C/C++, Fortran pragmas | Data-parallel GPU/CPU | GPUs & other accelerators | Annotating loops to offload work to accelerators |
| OpenMP | Compiler directives | C/C++, Fortran pragmas | Shared-memory CPU | Multi-core CPUs, GPUs | Parallelizing loops and regions on a single node |
| MPI | Library & standard | C/C++, Fortran, Python (via mpi4py) | Distributed-memory | Clusters & networks | Decomposing work across processes/nodes with message passing |