23.3. Introduction to Distributed Computing Workshop
23.3.1. Workshop Summary
This workshop covers the types of computational problems that distributed computing can address and the key forms of communication between compute nodes. The focus is on two widely used forms of distributed computing: embarrassingly parallel processes, useful for tasks such as hyperparameter sweeps, and Distributed Data Parallel (DDP) training, which spreads machine learning model training across multiple GPUs.
As part of the Workshops @ Kempner series, this interactive workshop provides a practical introduction to distributed computing in research settings. Topics include:
What distributed computing is and when to use it
Forms of communication between distributed systems
Embarrassingly parallel processes using SLURM array jobs (e.g., for hyperparameter sweeps; see the array-job sketch after this list)
Distributed Data Parallel (DDP) in PyTorch for training multi-layer perceptrons across multiple GPUs (see the DDP sketch after this list)
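To make the array-job topic concrete, here is a minimal sketch of a SLURM array job for a learning-rate sweep. The script name (sweep.sh), the resource requests, and train.py with its --lr flag are hypothetical placeholders, not part of the workshop materials; adjust them for your cluster and training script.

```bash
#!/bin/bash
# Hypothetical sketch: a SLURM array job sweeping over four learning rates.
#SBATCH --job-name=lr_sweep
#SBATCH --array=0-3            # four independent tasks, one per learning rate
#SBATCH --time=00:30:00
#SBATCH --mem=8G

# SLURM gives each task a distinct SLURM_ARRAY_TASK_ID (0..3 here),
# which selects one learning rate from the list.
LEARNING_RATES=(0.1 0.01 0.001 0.0001)
LR=${LEARNING_RATES[$SLURM_ARRAY_TASK_ID]}

python train.py --lr "$LR"   # train.py is a placeholder training script
```

Submitting this with `sbatch sweep.sh` launches four independent tasks; because the tasks never communicate, the sweep is embarrassingly parallel.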
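For the DDP topic, here is a minimal sketch of training a small multi-layer perceptron with PyTorch's DistributedDataParallel, assuming it is launched with something like `torchrun --nproc_per_node=4 ddp_mlp.py`. The file name, model sizes, and synthetic data are illustrative assumptions; a real run would read batches from a DataLoader with a DistributedSampler.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process;
    # init_process_group reads them to join the process group.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A small MLP; every process holds an identical replica on its own GPU.
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    model = DDP(model.cuda(local_rank), device_ids=[local_rank])

    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # Synthetic batches stand in for a DataLoader + DistributedSampler.
    for step in range(100):
        x = torch.randn(64, 32, device=local_rank)
        y = torch.randint(0, 10, (64,), device=local_rank)

        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()  # DDP all-reduces gradients across processes here
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Each process computes its own forward and backward pass; DDP synchronizes gradients during `backward()`, so all replicas stay in lockstep after each optimizer step.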
23.3.1.1. Prerequisites
Basic knowledge of SLURM
Familiarity with multi-layer perceptrons
Knowledge of PyTorch and backpropagation is helpful but not required
23.3.2. Workshop Slides
To view the “Introduction to Distributed Computing” workshop slides, click the following link:
Introduction to Distributed Computing Workshop