Accelerating ML Inference at Scale with ONNX, Triton and Seldon | PyData Global 2021
Speaker: Alejandro Saucedo
Summary
Identifying the right tools for high-performance production machine learning can be overwhelming as the ecosystem continues to grow at breakneck speed. In this session we showcase how practitioners can optimize and productionise ML models in scalable ecosystems without having to deal with the underlying infrastructure. We'll be optimizing a GPT-2 model with ONNX and deploying it to Triton using Seldon & Tempo.
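As a rough sketch of the optimization step (not the speaker's exact code), GPT-2 can be exported to ONNX with Hugging Face transformers and torch.onnx; the output path, sample text and axis names below are illustrative assumptions:

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    # Load the pretrained model and put it in inference mode.
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()
    model.config.return_dict = False  # return plain tuples, which the exporter handles cleanly

    # A dummy input is only used to trace the computation graph during export.
    dummy = tokenizer("ONNX export example", return_tensors="pt")

    torch.onnx.export(
        model,
        (dummy["input_ids"],),
        "model.onnx",  # Triton expects this inside a versioned model repository folder
        input_names=["input_ids"],
        output_names=["logits"],
        dynamic_axes={
            "input_ids": {0: "batch", 1: "sequence"},
            "logits": {0: "batch", 1: "sequence"},
        },
        opset_version=13,
    )

The exported model.onnx can then be further optimized (for example with ONNX Runtime graph optimizations or quantization) before being placed in the model repository that Triton serves.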
Description
Identifying the right tools for high-performance production machine learning can be overwhelming as the ecosystem continues to grow at breakneck speed. In this session we aim to provide a hands-on guide on how practitioners can productionise optimized machine learning models in scalable ecosystems using production-ready open source tools & frameworks. We will dive into a practical use case, deploying the renowned GPT-2 NLP machine learning model using the Tempo SDK.
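A minimal sketch of the serving side with the Tempo SDK, assuming the ONNX artifact exported above; the model name, URI, local folder, and the ModelFramework.ONNX / deploy_local calls follow Tempo's documented pattern but should be treated as assumptions that may differ between Tempo versions:

    import numpy as np
    from tempo import deploy_local
    from tempo.serve.metadata import ModelFramework
    from tempo.serve.model import Model

    # Declare the ONNX artifact so Tempo can serve it with the Triton runtime.
    gpt2_model = Model(
        name="gpt2-onnx",                       # assumed deployment name
        platform=ModelFramework.ONNX,           # tells Tempo to use Triton's ONNX backend
        local_folder="./artifacts/gpt2-onnx",   # assumed local path holding model.onnx
        uri="s3://models/gpt2-onnx",            # assumed remote artifact location
        description="GPT-2 exported to ONNX, served via Triton",
    )

    # Deploy locally in Docker for testing; the same definition can later be
    # deployed to a Kubernetes cluster running Seldon Core.
    remote_model = deploy_local(gpt2_model)
    token_ids = np.array([[15496, 995]], dtype=np.int64)  # example token ids
    print(remote_model.predict(token_ids))

The appeal of this pattern is that the data scientist only declares the artifact and its framework; Seldon and Triton handle the serving infrastructure underneath.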