This training teaches Apache Spark using the R API. It is part 2 of a 2-part series.
In this second Apache Spark training, you will be introduced to machine learning concepts with Spark’s MLlib API and learn how to apply them at scale. During the practical sessions, participants will work on a churn model based on customer data, an income prediction model based on data about people, a classification model for detecting spam in SMS text messages, and a joke recommendation system using ALS (alternating least squares).
The training includes theory, demos, and hands-on exercises.
After this training, you will have gained knowledge about:
- Apache Arrow and UDFs with sparklyr
- Basic machine learning concepts
- Spark MLlib
- Pipelines in Spark
- Using Spark and machine learning for predictions