GPU Programming - Introduction to CUDA
Event description
Wed, 17 Sep 2025, 1 pm - 5 pm
Thu, 18 Sep 2025, 9 am - 1 pm
This introduction to CUDA programming in C covers the fundamentals of parallel programming on NVIDIA's CUDA platform, including GPU architecture, memory management, kernel functions, and performance optimisation. The materials are designed for beginners and include step-by-step tutorials, practical examples, and exercises to help you start writing and running CUDA programs in C.
Not sure if this workshop is right for you?
See the Common HPC and Accelerator Tools section below for more information.
Prerequisites
Basic experience with C, including heap (dynamically allocated) memory.
Basic experience with the Unix shell and Git.
Basic experience with a text editor such as vim, emacs, or nano.
Learning Outcomes
At the completion of this training session, you will be able to:
Write and compile basic CUDA programs.
Understand the execution model of CUDA-enabled GPUs.
Manage device and host memory.
Optimise code for better performance.
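To give a flavour of what these outcomes involve, here is a minimal vector-addition sketch of the kind the course builds towards. It is not taken from the course materials; it simply shows a kernel, explicit host/device memory management, and a launch, compiled with `nvcc vec_add.cu -o vec_add`.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Kernel: each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host allocations (ordinary heap memory).
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Device allocations and host-to-device copies.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back; cudaMemcpy synchronises with the kernel.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);  /* expect 3.0 */

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```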
Topics Covered
GPU Overview
GPU Execution Model
GPU Workflow
Asynchronous CUDA Calls
Shared Memory
CUDA Events
Unified Memory
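As an illustration of the last topic above: unified memory (available since CUDA 6 via `cudaMallocManaged`) replaces the explicit host/device copies with a single pointer visible to both sides. A minimal sketch, not from the course materials:

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Kernel: scale each element in place.
__global__ void scale(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main(void) {
    const int n = 1024;
    float *x;
    // One managed allocation, accessible from both host and device.
    cudaMallocManaged(&x, n * sizeof(float));
    for (int i = 0; i < n; i++) x[i] = 1.0f;

    scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);
    // Kernel launches are asynchronous: synchronise before the host reads.
    cudaDeviceSynchronize();

    printf("x[0] = %f\n", x[0]);  /* expect 2.0 */
    cudaFree(x);
    return 0;
}
```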
Common HPC and Accelerator Tools
| Tool | Category | Language/API | Parallelism Type | Target Hardware | Typical Use Case |
|---|---|---|---|---|---|
| CuPy | Python library | Python (NumPy-compatible API) | Data-parallel GPU | NVIDIA GPUs | GPU-accelerated array and matrix operations as a drop-in replacement for NumPy |
| CUDA | GPU programming platform | C/C++ API (also Python via PyCUDA, Numba), Fortran | Data-parallel GPU | NVIDIA GPUs | Writing custom GPU kernels and fine-grained GPU code |
| OpenACC | Compiler directives | C/C++, Fortran pragmas | Data-parallel GPU/CPU | GPUs & other accelerators | Annotating loops to offload work to accelerators |
| OpenMP | Compiler directives | C/C++, Fortran pragmas | Shared-memory CPU | Multi-core CPUs, GPUs | Parallelizing loops and regions on a single node |
| MPI | Library & standard | C/C++, Fortran, Python (via mpi4py) | Distributed-memory | Clusters & networks | Decomposing work across processes/nodes with message passing |