Saturday, 30 December 2017

Apache Spark 2.2.1, Apache Hive 2.3.2 and the Amazon SageMaker integration with Apache Spark are now available on Amazon EMR release 5.11.0

Amazon EMR release 5.11.0 now supports Apache Spark 2.2.1, Apache Hive 2.3.2 and the Amazon SageMaker integration with Apache Spark. Spark 2.2.1 and Hive 2.3.2 include several improvements and bug fixes. Amazon SageMaker Spark is an open-source Spark library for Amazon SageMaker, a fully managed service for building, training and deploying machine learning models at scale. The library lets you interleave Spark stages with stages that interact with Amazon SageMaker in your Spark ML Pipelines, so you can train models on Spark DataFrames in Amazon SageMaker using Amazon-provided ML algorithms such as K-Means clustering and XGBoost. To create an Amazon EMR cluster with release 5.11.0, select the release label “emr-5.11.0” in the AWS Management Console, the AWS CLI or an SDK. Choose Apache Spark and Apache Hive to install those applications on the cluster. The Amazon SageMaker Spark library is installed automatically when you install Spark.
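
For readers who prefer the SDK route, here is a minimal sketch of how such a cluster request might look with the AWS SDK for Python (boto3). Only the release label and the two applications come from the announcement. The cluster name, instance type, instance count and region are illustrative assumptions, and the role names assume the default EMR roles already exist in your account.

```python
from typing import Any, Dict


def build_cluster_request(name: str = "spark-hive-demo",
                          instance_type: str = "m4.large",
                          instance_count: int = 3) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 EMR client's run_job_flow call.

    The name, instance type and instance count are placeholder assumptions.
    """
    return {
        "Name": name,
        # Release label from the announcement (note the hyphen).
        "ReleaseLabel": "emr-5.11.0",
        # Installing Spark is what brings in the SageMaker Spark library.
        "Applications": [{"Name": "Spark"}, {"Name": "Hive"}],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # Keep the cluster running after startup so jobs can be submitted.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumption: the default EMR roles have been created in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def create_cluster(region: str = "us-east-1") -> str:
    """Submit the request and return the new cluster's ID.

    Requires boto3 and valid AWS credentials; the region is an assumption.
    """
    import boto3  # imported here so the request builder works without boto3

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**build_cluster_request())
    return response["JobFlowId"]
```

Calling create_cluster() starts a real, billable cluster, so it is worth inspecting the dictionary returned by build_cluster_request() first and adjusting the instance settings to your own needs.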
