Duration: 3h 27m | Video: .MP4, 1920x1080 30 fps | Audio: AAC, 44.1 kHz, 2ch | Size: 1.29 GB
Genre: eLearning | Language: English
Learn to analyze large data sets with Apache Spark through 10+ hands-on examples. Take your big data skills to the next level.
What you'll learn
An overview of the architecture of Apache Spark.
Work with Apache Spark's primary abstraction, resilient distributed datasets (RDDs), to process and analyze large data sets.
Develop Apache Spark 2.0 applications using RDD transformations and actions and Spark SQL.
Scale up Spark applications on a Hadoop YARN cluster through Amazon's Elastic MapReduce service.
Analyze structured and semi-structured data using Datasets and DataFrames, and develop a thorough understanding of Spark SQL.
Share information across different nodes on an Apache Spark cluster using broadcast variables and accumulators.
Apply advanced techniques to optimize and tune Apache Spark jobs by partitioning, caching and persisting RDDs.
Best practices of working with Apache Spark in the field.
Requirements
A computer running Windows, OS X or Linux
Previous Java programming skills
Java 8 experience is preferred but NOT required
Description
What is this course about?
This course covers the fundamentals of Apache Spark with Java and teaches you everything you need to know about developing Spark applications in Java. By the end of this course, you will have gained in-depth knowledge of Apache Spark, along with general big data analysis and manipulation skills, to help your company adopt Apache Spark for building big data processing pipelines and data analytics applications.