Big Data Trainee

The Big Data with PySpark Traineeship is a self-paced, 4-week industrial training program with mentorship support.

Program Description

Advance your data skills by mastering Apache Spark. Using the Spark Python API, PySpark, you will leverage parallel computation with large datasets, and get ready for high-performance machine learning. From cleaning data to creating features and implementing machine learning models, you’ll execute end-to-end workflows with Spark. The program ends with building a recommendation engine using the popular MovieLens dataset and the Million Songs dataset.

What we cover in 4 weeks

Weeks 1 – 3

Introduction to PySpark

Big Data Fundamentals with PySpark

Cleaning Data with PySpark

Feature Engineering with PySpark

Machine Learning with PySpark

Building Recommendation Engines with PySpark

Week 4

Capstone Project

Program Benefits

Certificate of Completion

Letter of Recommendation

Complete this course while you work

Rigorous curriculum designed by industry experts

Best performers will also be offered a job within the company.

Job Category: Data Analysis, Data Science
Job Type: Traineeship
Job Location: Remote
