Sparkour is an open-source collection of programming recipes for Apache Spark. Designed as an efficient way to navigate the intricacies of the Spark ecosystem, Sparkour aims to be an approachable, understandable, and actionable cookbook for distributed data processing.

Sparkour delivers extended tutorials for developers new to Spark as well as shorter, standalone recipes that address common developer needs in Java, Python, R, and Scala. The entire trove is licensed under the Apache License 2.0.

What's New?

2019-01-22 Tutorial #4: Writing and Submitting a Spark Application has been updated with instructions for installing Python 3. All Python recipes have been tested against Python 3.6.7.
2019-01-06 Happy New Year! All recipes have been updated and tested against Spark 2.4.0. I have also incorporated some behind-the-scenes automation to streamline regression testing and make it easier for me to stay in sync with future Spark releases.
2018-05-27 All recipes have been updated and tested against Spark 2.3.0 and Scala 2.11.12.
2017-08-05 All recipes have been updated and tested against Spark 2.2.0.

About the Author

Brian Uri! is a solutions architect at the advanced analytics company, Novetta. He has over 15 years of experience in software engineering, proposal writing, and government data standards, with relevant certifications in Amazon Web Services and Apache Hadoop.

Sparkour was conceived in February 2016 as a way for Brian to learn Apache Spark while scratching an itch to create more open-source software. Brian is also the creator of the open-source library, DDMSence.

Apache, Spark, and Apache Spark are trademarks of the Apache Software Foundation (ASF).
Sparkour is © 2016 - 2019 by Brian Uri!. It is an independent project that is not endorsed or supported by Novetta or the ASF.