Sparkour

Sparkour is an open-source collection of programming recipes for Apache Spark. Designed as an efficient way to navigate the intricacies of the Spark ecosystem, Sparkour aims to be an approachable, understandable, and actionable cookbook for distributed data processing.

Sparkour delivers extended tutorials for developers new to Spark as well as shorter, standalone recipes that address common developer needs in Java, Python, R, and Scala. The entire trove is licensed under the Apache License 2.0.

What's New?

2020-03-07 All recipes have been updated and tested against Spark 2.4.5.
2019-10-20 "Configuring Amazon S3 as a Spark Data Source" and "Configuring Spark to Use Amazon S3" have been updated to reflect the deprecation of the s3n protocol in favor of s3a.
2019-10-19 All recipes have been updated and tested against Spark 2.4.4.
2019-05-30 All recipes have been updated and tested against Spark 2.4.3.

About the Author

Brian Uri! is a solutions architect at the advanced analytics company, Novetta. He has over 15 years of experience in software engineering, proposal writing, and government data standards, with relevant certifications in Amazon Web Services and Apache Hadoop.

Sparkour was conceived in February 2016 as a way for Brian to learn Apache Spark while scratching an itch to create more open-source software. Brian is also the creator of the open-source library, DDMSence.

Apache, Spark, and Apache Spark are trademarks of the Apache Software Foundation (ASF).
Sparkour is © 2016 - 2020 by Brian Uri!. It is an independent project that is not endorsed or supported by Novetta or the ASF.