
- Sponsor
- NCSA, Illinois Computes
- Speaker
- Soham Pal
- Registration
- Registration
- Contact
- Soham Pal
- soham@illinois.edu
- Originating Calendar
- NCSA Research Consulting Training Events
Instructors: Soham Pal
Date: Wednesday, April 15, 2026
Time: 1:00 PM - 3:00 PM Central Time
Location: Zoom (Zoom coordinates will be provided to registrants before the workshop.)
Abstract: PyTorch Lightning provides streamlined libraries like Fabric and Trainer for efficient multi-GPU training, enabling AI practitioners to scale their deep learning applications with ease. This workshop will delve into advanced techniques for training models across multiple GPUs, discussing essential strategies to optimize convergence and performance. Attendees will also learn to leverage TensorBoard for comprehensive experiment tracking and metric visualization, ensuring robust monitoring throughout the training lifecycle.
The session will also address best practices for model persistence and resilience, including methods for saving intermediate checkpoints and seamlessly resuming training after interruptions. By mastering these techniques, participants will enhance their ability to manage large-scale, distributed training workflows, reducing computational overhead and accelerating development cycles. This hands-on exploration is designed to equip researchers with practical skills to scale their deep learning workflows.
Prerequisites: Prior experience with PyTorch fundamentals will be useful.
Hands-on participation: The workshop will use NCSA's Delta cluster for hands-on demonstrations. You will need an ACCESS account to participate in the hands-on exercises on Delta. The account is free and can be set up on the ACCESS account registration page. Please set up your account before completing this registration form.
Register by April 8, 2026