
GPU Programming - Introduction to CUDA

Online Event

Tue, 16 Sep, 11pm - Wed, 17 Sep, 11pm EDT

Event description

Wed, 17 Sep 2025, 1 pm - 5 pm

Thu, 18 Sep 2025, 9 am - 1 pm

This introduction to CUDA programming in C covers the fundamentals of parallel programming on NVIDIA’s CUDA platform, including GPU architecture, memory management, kernel functions, and performance optimisation. The materials are designed for beginners and include step-by-step tutorials, practical examples, and exercises to help you start writing and running CUDA programs in C.
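To give a flavour of the material, the sketch below is a minimal, self-contained CUDA C program of the kind the workshop works towards: it allocates host and device memory, copies data to the GPU, launches a kernel, and copies the result back. It is an illustrative example written for this page, not code taken from the course.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Kernel: each GPU thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host allocations (heap memory, as in the C prerequisite).
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Device allocations and host-to-device copies.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back and spot-check one element.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]); /* expect 3.0 */

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

On a machine with the CUDA toolkit and an NVIDIA GPU, this would typically be compiled and run with `nvcc vecadd.cu -o vecadd && ./vecadd`.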

Not sure if this workshop is right for you?

See the Common HPC and Accelerator Tools section below for more information.

Prerequisites

  1. Basic experience with C and heap memory.

  2. Basic experience with the Unix shell and Git.

  3. Basic experience with a text editor such as Vim, Emacs, or Nano.

Learning Outcomes

At the completion of this training session, you will be able to:

  • Write and compile basic CUDA programs.

  • Understand the execution model of CUDA-enabled GPUs.

  • Manage device and host memory.

  • Optimise code for better performance.

Topics Covered

  • GPU Overview

  • GPU Execution Model

  • GPU Workflow

  • Asynchronous CUDA Calls

  • Shared Memory

  • CUDA Events

  • Unified Memory
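Two of the later topics, unified memory and CUDA events, can be previewed together in a short sketch: managed memory gives one pointer usable from both host and device, and events time work on the GPU. The kernel and sizes below are illustrative assumptions, not course code.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Kernel: scale each element of x in place.
__global__ void scale(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main(void) {
    const int n = 1 << 20;
    float *x;

    // Unified (managed) memory: no explicit cudaMemcpy needed.
    cudaMallocManaged(&x, n * sizeof(float));
    for (int i = 0; i < n; i++) x[i] = 1.0f;

    // CUDA events bracket the launch for GPU-side timing.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);
    cudaEventRecord(stop);

    // Kernel launches are asynchronous; wait for the stop event
    // before reading results on the host.
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("x[0] = %f, kernel took %.3f ms\n", x[0], ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(x);
    return 0;
}
```

Because kernel launches are asynchronous, the `cudaEventSynchronize` call also illustrates why the workshop covers asynchronous CUDA calls alongside events.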

Common HPC and Accelerator Tools

| Tool | Category | Language/API | Parallelism Type | Target Hardware | Typical Use Case |
| --- | --- | --- | --- | --- | --- |
| CuPy | Python library | Python (NumPy-compatible API) | Data-parallel GPU | NVIDIA GPUs | GPU-accelerated array and matrix operations as a drop-in replacement for NumPy |
| CUDA | GPU programming platform | C/C++ API (also Python via PyCUDA, Numba), Fortran | Data-parallel GPU | NVIDIA GPUs | Writing custom GPU kernels and fine-grained GPU code |
| OpenACC | Compiler directives | C/C++, Fortran pragmas | Data-parallel GPU/CPU | GPUs & other accelerators | Annotating loops to offload work to accelerators |
| OpenMP | Compiler directives | C/C++, Fortran pragmas | Shared-memory CPU | Multi-core CPUs, GPUs | Parallelizing loops and regions on a single node |
| MPI | Library & standard | C/C++, Fortran, Python (via mpi4py) | Distributed-memory | Clusters & networks | Decomposing work across processes/nodes with message passing |
