Big Data Meets Machine Learning: Architecting Spark Environment for Data Scientists
In this code-level session, we show you how to integrate your Apache Spark application with Amazon SageMaker. We’ll cover the networking and architecture involved and how code executes across the components of the solution. We will also dive deep into how to perform data exploration and feature engineering on the Spark cluster, start training jobs from Spark, integrate training jobs into Spark pipelines, and more. Amazon SageMaker, our fully managed machine learning platform, comes with pre-built algorithms and popular deep learning frameworks. It also includes an Apache Spark library that you can use to train models from your Spark clusters.
