
23.4. Large Language Model Distributed Training Workshop

23.4.1. Workshop Summary

This workshop covers parallelization techniques for training large language models: Distributed Data Parallelism (DDP), Model Parallelism (MP), Tensor Parallelism (TP), Pipeline Parallelism (PP), and Fully Sharded Data Parallelism (FSDP). In addition to reviewing the advantages and use cases of each technique, the workshop provides hands-on examples to build a practical understanding of distributed LLM training.
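As a taste of the hands-on material, the sketch below shows the simplest of these techniques, DDP, in PyTorch. This is an illustration rather than the workshop's own example: it uses a toy linear layer in place of an LLM and assumes the script is launched with `torchrun`, which starts one process per GPU and sets the `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` environment variables.

```python
# Minimal DDP sketch (illustrative; assumes launch via torchrun).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # One process per GPU; NCCL backend for GPU collectives.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model stands in for an LLM; each rank holds a full replica.
    model = torch.nn.Linear(1024, 1024).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        # Each rank processes a different shard of the data (random here).
        x = torch.randn(8, 1024, device=f"cuda:{local_rank}")
        loss = model(x).pow(2).mean()
        loss.backward()  # DDP all-reduces gradients during backward
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launched with, for example, `torchrun --nproc_per_node=4 ddp_sketch.py` (a hypothetical filename), each process owns one GPU and DDP keeps the replicas in sync by averaging gradients across ranks. FSDP follows the same launch pattern but additionally shards parameters, gradients, and optimizer state across ranks, which is what makes it viable for models too large to replicate on a single GPU.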

23.4.1.1. Prerequisites

  • Familiarity with the PyTorch framework and Python programming

  • Familiarity with LLMs

  • Familiarity with High Performance Computing (HPC) clusters

23.4.2. Workshop Slides

To download the “Large Language Model Distributed Training” workshop slides, click the link below.

Kempner LLM Distributed Training Workshop