Email: training@skillsforafrica.org, info@skillsforafrica.org

Data Engineering with Cloud Platforms (Snowflake, Databricks): Building Scalable Data Pipelines

Introduction:

Modern data engineering demands scalable and reliable pipelines for processing vast amounts of data. This course on Data Engineering with Cloud Platforms (Snowflake, Databricks) equips participants with the specialized knowledge and skills to build robust data pipelines in cloud environments. Participants will learn how to leverage Snowflake and Databricks for data warehousing, ETL/ELT, and advanced analytics. This course bridges the gap between traditional data engineering and cloud-native data processing, empowering professionals to create efficient and scalable data solutions.

Target Audience:

This course is designed for data professionals seeking to build and manage data pipelines in cloud environments, including:

  • Data Engineers
  • Data Architects
  • ETL/ELT Developers
  • Cloud Engineers
  • Data Scientists
  • Anyone involved in building and maintaining data pipelines

Course Objectives:

Upon completion of this Data Engineering with Cloud Platforms course, participants will be able to:

  • Understand the architecture and capabilities of Snowflake and Databricks.
  • Design and build scalable data pipelines in cloud environments.
  • Utilize Snowflake for data warehousing and analytics.
  • Utilize Databricks for data engineering and machine learning.
  • Implement ETL/ELT processes using cloud-native tools.
  • Understand data governance and data quality management.
  • Implement data security and access control in cloud data platforms.
  • Optimize data pipelines for performance and cost.
  • Understand data lakehouse architectures.
  • Implement data integration and data transformation.
  • Enhance their ability to build and manage scalable data pipelines.
  • Improve their organization's data engineering practices.
  • Contribute to improved data availability and data-driven decision-making.
  • Stay up to date with the latest trends and best practices in cloud data engineering.
  • Become a more knowledgeable and effective data engineer.
  • Understand ethical considerations in data pipeline development.
  • Learn how to use Snowflake and Databricks tools and platforms effectively.

DURATION

10 Days

COURSE CONTENT

Module 1: Introduction to Modern Data Engineering and Cloud Platforms

  • Understanding the evolution of data engineering and the need for cloud platforms.
  • Overview of Snowflake and Databricks architectures and capabilities.
  • Comparing Snowflake and Databricks: strengths and use cases.
  • Understanding data lakehouse architectures and their benefits.
  • Setting up development environments for Snowflake and Databricks.

Module 2: Snowflake Fundamentals and Data Warehousing

  • Understanding Snowflake's architecture and storage model.
  • Creating and managing Snowflake databases, schemas, and tables.
  • Loading data into and unloading data from Snowflake.
  • Utilizing Snowflake's virtual warehouses for compute.
  • Understanding Snowflake's data sharing and cloning capabilities.

Module 3: Databricks Fundamentals and Data Processing

  • Understanding Databricks' architecture and workspace.
  • Working with Databricks notebooks and clusters.
  • Utilizing Apache Spark in Databricks for data processing.
  • Understanding Databricks Delta Lake and its benefits.
  • Managing Databricks libraries and dependencies.

Module 4: Data Ingestion and ETL/ELT with Snowflake

  • Utilizing Snowflake's Snowpipe for continuous data ingestion.
  • Implementing ETL/ELT processes using Snowflake SQL.
  • Understanding data transformations and aggregations in Snowflake.
  • Utilizing Snowflake tasks for automated data processing.
  • Implementing data quality checks in Snowflake.

Module 5: Data Ingestion and ETL/ELT with Databricks

  • Utilizing Databricks Auto Loader for incremental data ingestion.
  • Implementing ETL/ELT processes using Spark DataFrames and Delta Lake.
  • Understanding data transformations and aggregations in Databricks.
  • Utilizing Databricks workflows and jobs for automated data pipelines.
  • Implementing data quality checks in Databricks.

Module 6: Data Transformation and Data Modeling

  • Understanding data transformation techniques and best practices.
  • Implementing data cleansing, normalization, and enrichment.
  • Utilizing data modeling techniques for analytical workloads.
  • Understanding dimensional modeling and star/snowflake schemas.
  • Implementing data modeling in Snowflake and Databricks.

Module 7: Data Governance and Data Quality Management

  • Understanding data governance principles and practices.
  • Implementing data cataloging and metadata management.
  • Utilizing data lineage and data profiling tools.
  • Implementing data quality rules and checks.
  • Understanding data security and access control.

Module 8: Data Security and Access Control in Snowflake

  • Understanding Snowflake's security model and features.
  • Implementing role-based access control (RBAC) in Snowflake.
  • Utilizing Snowflake's data masking and row-level security.
  • Implementing data encryption and key management.
  • Understanding Snowflake's compliance and auditing capabilities.

Module 9: Data Security and Access Control in Databricks

  • Understanding Databricks' security model and features.
  • Implementing Unity Catalog for data governance and security.
  • Utilizing Databricks' access control lists (ACLs) and permissions.
  • Implementing data encryption and key management.
  • Understanding Databricks' compliance and auditing capabilities.

Module 10: Data Pipeline Performance Tuning and Optimization

  • Understanding performance tuning techniques for Snowflake and Databricks.
  • Optimizing queries and data processing in Snowflake.
  • Optimizing Spark jobs and clusters in Databricks.
  • Utilizing caching and data partitioning strategies.
  • Monitoring and troubleshooting data pipeline performance.

Module 11: Data Integration and Data Lakehouse Architecture

  • Understanding data integration patterns and techniques.
  • Implementing data integration with external systems and APIs.
  • Utilizing data lakehouse architectures for unified data management.
  • Understanding Delta Lake for building reliable data lakes.
  • Implementing data lakehouse use cases with Snowflake and Databricks.

Module 12: Real-Time Data Processing and Streaming

  • Understanding real-time data processing concepts and technologies.
  • Utilizing Snowflake's Snowpipe Streaming for real-time data ingestion.
  • Utilizing Databricks Structured Streaming for real-time data processing.
  • Implementing real-time data analytics and dashboards.
  • Understanding event-driven data architectures.

Module 13: Machine Learning Integration with Databricks

  • Understanding Databricks' machine learning capabilities.
  • Utilizing MLflow for machine learning lifecycle management.
  • Implementing machine learning pipelines in Databricks.
  • Deploying machine learning models for inference.
  • Integrating machine learning with data pipelines.

Module 14: Data Pipeline Automation and Orchestration

  • Understanding data pipeline automation and orchestration concepts.
  • Utilizing Databricks workflows and jobs for pipeline orchestration.
  • Implementing CI/CD pipelines for data engineering projects.
  • Utilizing external orchestration tools (Airflow, Prefect).
  • Implementing monitoring and alerting for data pipelines.

Module 15: Data Engineering Best Practices and Future Trends

  • Understanding data engineering best practices for cloud platforms.
  • Implementing data governance and compliance in data pipelines.
  • Exploring emerging data engineering technologies and trends.
  • Understanding the impact of AI and machine learning on data engineering.
  • Continuous learning and professional development in cloud data engineering.

Training Approach

This course will be delivered by skilled trainers with extensive knowledge and professional experience in the field. The course is taught in English through a mix of theory, practical activities, group discussions, and case studies. Course manuals and additional training materials will be provided to participants upon completion of the training.

Tailor-Made Course

This course can also be tailored to meet an organization's requirements. For further inquiries, please contact us: Email: info@skillsforafrica.org, training@skillsforafrica.org Tel: +254 702 249 449

Training Venue

The training will be held at the Skills for Africa Training Institute Training Centre. We also offer group training at a requested location anywhere in the world. The course fee covers tuition, training materials, two refreshment breaks, and a buffet lunch.

Visa application, travel expenses, airport transfers, dinners, accommodation, insurance, and other personal expenses are the responsibility of the participant.

Certification

Participants will be issued a Skills for Africa Training Institute certificate upon completion of this course.

Airport Pickup and Accommodation

Airport pickup and accommodation are arranged upon request. To book, contact our Training Coordinator: Email: info@skillsforafrica.org, training@skillsforafrica.org Tel: +254 702 249 449

Terms of Payment: Unless otherwise agreed between the two parties, payment of the course fee should be made 5 working days before commencement of the training.

Course Schedule
Dates (DD/MM/YYYY)       Fees   Location
07/04/2025 - 18/04/2025  $3000  Nairobi
14/04/2025 - 25/04/2025  $3500  Mombasa
14/04/2025 - 25/04/2025  $3000  Nairobi
05/05/2025 - 16/05/2025  $3000  Nairobi
12/05/2025 - 23/05/2025  $5500  Dubai
19/05/2025 - 30/05/2025  $3000  Nairobi
02/06/2025 - 13/06/2025  $3000  Nairobi
09/06/2025 - 20/06/2025  $3500  Mombasa
16/06/2025 - 27/06/2025  $3000  Nairobi
07/07/2025 - 18/07/2025  $3000  Nairobi
14/07/2025 - 25/07/2025  $5500  Johannesburg
14/07/2025 - 25/07/2025  $3000  Nairobi
04/08/2025 - 15/08/2025  $3000  Nairobi
11/08/2025 - 22/08/2025  $3500  Mombasa
18/08/2025 - 29/08/2025  $3000  Nairobi
01/09/2025 - 12/09/2025  $3000  Nairobi
08/09/2025 - 19/09/2025  $4500  Dar es Salaam
15/09/2025 - 26/09/2025  $3000  Nairobi
06/10/2025 - 17/10/2025  $3000  Nairobi
13/10/2025 - 24/10/2025  $4500  Kigali
20/10/2025 - 31/10/2025  $3000  Nairobi
03/11/2025 - 14/11/2025  $3000  Nairobi
10/11/2025 - 21/11/2025  $3500  Mombasa
17/11/2025 - 28/11/2025  $3000  Nairobi
01/12/2025 - 12/12/2025  $3000  Nairobi
08/12/2025 - 19/12/2025  $3000  Nairobi