Course Curriculum
MODULE 1: Data Warehousing Fundamentals (Weeks 1-2)
- Introduction to Data Warehousing Concepts
- Why Organizations Need Data Warehouses
- Data Warehouse vs Operational Database
- Data Warehouse Architecture Components
- Source Systems, Staging Area, Data Warehouse, Data Marts
- OLTP vs OLAP: Key Differences
- Understanding OLAP Operations (Slice, Dice, Drill-down, Roll-up)
- Dimensions, Facts, and Measures
- Slowly Changing Dimensions (SCD Types 1, 2, 3)
- Star Schema Design
- Snowflake Schema Design
- Galaxy Schema
Hands-on Lab: Design a star schema for a retail business
MODULE 2: ETL Concepts & Architecture (Weeks 3-4)
- ETL Overview: Extract, Transform, Load
- ETL vs ELT: When to Use Which
- ETL Architecture and Data Flow
- Source Systems and Data Extraction
- Full Load vs Incremental Load
- Data Transformation Rules and Logic
- Data Cleansing and Standardization
- Data Quality Checks
- Loading Strategies and Best Practices
- ETL Metadata Management
- ETL Tool Overview (Informatica, SSIS, DataStage, Talend)
- ETL Advantages and Limitations
Project: Document ETL workflow for a sample business scenario
MODULE 3: SQL for ETL Testing (Week 5)
- SQL Fundamentals Review
- Complex Queries with Multiple Joins
- Aggregate Functions and GROUP BY
- Subqueries and Correlated Subqueries
- Window Functions for Data Analysis
- Set Operations: UNION, INTERSECT, EXCEPT
- Data Comparison Queries
- Duplicate Detection Queries
- Data Profiling with SQL
- Query Optimization for Large Datasets
Hands-on Lab: Write SQL queries for common ETL testing scenarios
MODULE 4: ETL Testing Validation (Weeks 6-7)
- Need for ETL Testing in Data Projects
- ETL Testing vs Traditional Database Testing
- ETL Testing Life Cycle
- Test Planning and Strategy
- ETL Testing Scenarios and Test Cases
- Source to Target Data Validation
- Data Completeness Testing
- Data Accuracy and Integrity Testing
- Data Transformation Testing
- Duplicate Data Testing
- NULL and Default Value Testing
- Date and Numeric Field Testing
- Boundary Value Testing
- Negative Testing Scenarios
- Performance Testing for ETL Jobs
- Regression Testing Strategies
Project: Create comprehensive ETL test plan with 50+ test cases
MODULE 5: Big Data Testing & Advanced Topics (Week 8)
- Introduction to Big Data Concepts
- Big Data Ecosystem Overview (Hadoop, Spark)
- Big Data Testing Challenges
- Testing Data in Distributed Systems
- Data Lake Testing Concepts
- Cloud Data Warehouse Testing (AWS Redshift, Azure Synapse)
- Data Pipeline Testing
- Real-time Data Testing Approaches
- Defect Reporting and Management
- ETL Testing Best Practices
Final Project: End-to-end ETL testing project with documentation
CAPSTONE PROJECT (Throughout Weeks 7-8)
Complete a real-world ETL testing project:
- Analyze source systems and target data warehouse
- Create detailed test plan and test cases
- Execute source-to-target validation
- Perform data quality checks
- Document defects and recommendations
- Present findings to stakeholders
Evaluation Criteria
Test coverage (25%), SQL proficiency (25%), Documentation (20%), Defect reporting (15%), Presentation (15%)