Sparkour

Sparkour is an open-source collection of programming recipes for Apache Spark. Designed as an efficient way to navigate the intricacies of the Spark ecosystem, Sparkour aims to be an approachable, understandable, and actionable cookbook for distributed data processing.

Sparkour delivers extended tutorials for developers new to Spark as well as shorter, standalone recipes that address common developer needs in Java, Python, R, and Scala. The entire trove is licensed under the Apache License 2.0.

What's New?

2017-08-05 All recipes have been updated and tested against Spark 2.2.0.
2017-05-29 All recipes have been updated and tested against Spark 2.1.1 and Scala 2.11.11.
2016-10-09 All recipes have been updated and tested against Spark 2.0.1.
2016-09-24 "Understanding the SparkSession in Spark 2.0" introduces the new SparkSession class, which provides a unified entry point in place of the various Context classes found in Spark 1.x.
2016-09-22 "Installing and Configuring Apache Zeppelin" explains how to install Apache Zeppelin and configure it to work with Spark. Interactive notebooks such as Zeppelin make it easier for analysts (who may not be software developers) to harness the power of Spark through iterative exploration and built-in visualizations.

About the Author

Brian Uri! is a solutions architect at the advanced analytics company Novetta. He has over a decade of experience in software development and government data standards, with relevant certifications in Apache Hadoop and Amazon Web Services.

Sparkour was conceived in February 2016 as a way for Brian to learn Apache Spark while scratching an itch to create more open-source software. Brian is also the creator of the open-source library DDMSence.

Apache, Spark, and Apache Spark are trademarks of the Apache Software Foundation (ASF).
Sparkour is © 2016 - 2017 by Brian Uri! It is an independent project that is not endorsed or supported by Novetta or the ASF.