Alexa: Which is the Best Instance to Run Machine Learning Inference? - AWS Online Tech Talks

Amazon Alexa and Rekognition services are based on machine learning and process millions of requests every second. By switching from GPU-based instances to AWS Inferentia-based Amazon EC2 Inf1 instances for machine learning inference, these services cut their inference costs by 45% while boosting performance. If you are a developer or a business building machine learning capabilities into applications that run at scale, this tech talk will take you inside Alexa’s and Rekognition’s architecture and show
