Deploying Cost-Effective Deep Learning Inference – AWS Online Tech Talks
In most deep learning applications, inference can account for as much as 90% of the application's compute costs. GPU instances can be oversized, and thus expensive, for inference, while CPU instances may not be performant enough. In this tech talk, we discuss how Amazon Elastic Inference solves this problem. You will learn how to lower your inference costs by up to 75% by using Elastic Inference with the TensorFlow and Apache MXNet deep learning frameworks.
– Learn about key use cases that require GPU-powered acceleration for inference
– Understand how Amazon Elastic Inference works
– See how to use Elastic Inference on Amazon SageMaker, Amazon EC2, and Amazon ECS