The Big Data with PySpark Traineeship is a self-paced, 4-week industrial training program with mentorship support.
Program Description
Advance your data skills by mastering Apache Spark. Using PySpark, the Spark Python API, you will run parallel computations on large datasets and get ready for high-performance machine learning. From cleaning data to creating features and implementing machine learning models, you’ll execute end-to-end workflows with Spark. The program ends with building a recommendation engine using the popular MovieLens dataset and the Million Song Dataset.
What we cover in 4 weeks
Weeks 1–3
Introduction to PySpark
Big Data Fundamentals with PySpark
Cleaning Data with PySpark
Feature Engineering with PySpark
Machine Learning with PySpark
Building Recommendation Engines with PySpark
Week 4
Capstone Project
Program Benefits
Certificate of Completion
Letter of Recommendation
Complete this course while you work
Rigorous curriculum designed by industry experts
The best performers will also be offered a job within the company