AWS re:Invent 2021 - Large-scale distributed training of media ML models with Amazon FSx
In this session, learn about the challenges of scaling distributed training of media machine learning models on the multi-GPU nodes used by Netflix, and how Amazon FSx is used to resolve data loader performance bottlenecks in the training system. See the performance and throughput improvements achieved across multiple GPU nodes, and how Amazon FSx scales.