Getting started with Python on Spark

On my current project I work a lot with Apache Spark, mostly running PySpark jobs on it. For those who also want to get their hands dirty with Spark in combination with Python, I can recommend this course at Udemy. It gives a broad introduction to Spark in general, covers the different modules (Spark Streaming, Spark SQL, MLlib and GraphX), and shows how to use Python to work with Spark.
It also explains how you can run your own cluster in the cloud with AWS EMR, about which I have written several posts before. And no worries: after completing this course there is still a lot more to discover about Spark 😉 but, like I said, it should be sufficient to get started.

About Pascal Alma

Pascal is a senior IT consultant and has been working in IT since 1997. He closely follows the latest developments in new technologies (Mobile, Cloud, Big Data) and is particularly interested in Java open source tool stacks, cloud-related technologies like AWS, and mobile development such as building iOS apps with Swift. Specialties: Java/JEE/Spring, Amazon AWS, API/REST, Big Data, Continuous Delivery, Swift/iOS