In deep learning applications, inference accounts for up to 90% of compute cost. To reduce this cost, you can use Amazon Elastic Inference, which lets you attach just the right amount of GPU-powered inference acceleration to any EC2 or SageMaker instance type or ECS task. In this tech talk, you will learn how to use Elastic Inference to deploy models built on PyTorch, a popular machine learning framework.

Learning Objectives:
*Get an overview of Amazon Elastic Inference
*Learn how to use Elastic Inference to reduce costs and improve latency for your PyTorch models on Amazon SageMaker
*Get a demo of using TorchScript with the Elastic Inference API

***To learn more about the services featured in this talk, please visit:
