End-to-End Analytics Pipeline
GitHub: Repository
Overview
An end-to-end analytics pipeline for an e-commerce dataset — from raw ingestion through structured transformation to KPI dashboards. Built around the ELT pattern with dbt as the transformation layer.
Pipeline Architecture
- Ingestion (Python) — Raw order and product data loaded into a PostgreSQL staging schema.
- Transformation (dbt) — Data modeled into a star schema with 10+ tables. Data quality tests enforced at every layer.
- Visualization (Metabase) — Interactive dashboard exposing key business KPIs.
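The ingestion step can be sketched as below. This is a minimal illustration, not the project's actual loader: the table name `stg_raw_orders`, the column names, and the helper `load_raw_orders` are all hypothetical, and the standard-library `sqlite3` module stands in for PostgreSQL so the example runs anywhere. With a PostgreSQL driver the placeholder syntax and connection setup would differ.

```python
import csv
import io
import sqlite3


def load_raw_orders(conn, csv_file, table="stg_raw_orders"):
    """Load raw order rows into a staging table without transforming them.

    In an ELT pipeline the staging layer keeps values as delivered
    (here: plain text) and leaves typing and cleaning to the
    transformation layer.
    """
    reader = csv.DictReader(csv_file)
    rows = [
        (r["order_id"], r["customer_id"], r["order_date"], r["amount"])
        for r in reader
    ]
    # Hypothetical staging layout: every column stored as TEXT.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "order_id TEXT, customer_id TEXT, order_date TEXT, amount TEXT)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Usage with an in-memory database and a two-row sample file.
sample = io.StringIO(
    "order_id,customer_id,order_date,amount\n"
    "1001,c1,2024-01-05,59.90\n"
    "1002,c2,2024-01-06,120.00\n"
)
conn = sqlite3.connect(":memory:")
loaded = load_raw_orders(conn, sample)
```

Keeping the staging columns untyped means a malformed value in the source file cannot fail the load. It surfaces later, in the transformation layer, where it can be tested.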
Data Model
A star schema with:
- fct_orders and fct_order_lines — core transactional facts
- dim_customers, dim_products, dim_dates — supporting dimensions
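To show how a fact table relates to a dimension in a star schema, here is a small sketch that joins `fct_order_lines` to `dim_products`. Only the table names come from the project. The columns (`product_key`, `category`, `quantity`, `unit_price`), the sample rows, and the helper `revenue_by_category` are assumptions made for illustration. `sqlite3` again stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension: one row per product (hypothetical columns).
    CREATE TABLE dim_products (
        product_key INTEGER PRIMARY KEY,
        category    TEXT NOT NULL
    );
    -- Fact: one row per order line, pointing at the dimension by key.
    CREATE TABLE fct_order_lines (
        order_line_key INTEGER PRIMARY KEY,
        order_id       INTEGER NOT NULL,
        product_key    INTEGER NOT NULL REFERENCES dim_products (product_key),
        quantity       INTEGER NOT NULL,
        unit_price     REAL NOT NULL
    );
    INSERT INTO dim_products VALUES (1, 'books'), (2, 'games');
    INSERT INTO fct_order_lines VALUES
        (1, 1001, 1, 2, 10.0),
        (2, 1001, 2, 1, 30.0),
        (3, 1002, 1, 1, 10.0);
    """
)


def revenue_by_category(conn):
    """Aggregate the fact table, grouped by a dimension attribute."""
    query = """
        SELECT p.category, SUM(l.quantity * l.unit_price) AS revenue
        FROM fct_order_lines AS l
        JOIN dim_products AS p ON p.product_key = l.product_key
        GROUP BY p.category
        ORDER BY p.category
    """
    return conn.execute(query).fetchall()
```

The fact table holds the measures and the keys, while descriptive attributes such as the category live in the dimension, so reporting queries follow the same join-then-aggregate shape.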
Data Quality
dbt tests applied across all models:
- Uniqueness — primary keys are unique
- Not null — required fields are always populated
- Referential integrity — foreign keys validated against dimension tables
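The three checks above can each be expressed as a query that counts offending rows, where a count of zero means the check passes. The sketch below illustrates that idea in Python. It is not dbt's own implementation, and the helper names, column names, and sample rows are hypothetical. `sqlite3` stands in for PostgreSQL.

```python
import sqlite3


def count_duplicates(conn, table, column):
    """Uniqueness: number of non-null values that occur more than once."""
    sql = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {column} FROM {table} WHERE {column} IS NOT NULL "
        f"GROUP BY {column} HAVING COUNT(*) > 1)"
    )
    return conn.execute(sql).fetchone()[0]


def count_nulls(conn, table, column):
    """Not null: number of rows where the required column is missing."""
    sql = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(sql).fetchone()[0]


def count_orphans(conn, child, fk, parent, pk):
    """Referential integrity: child rows whose key has no parent row."""
    sql = f"""
        SELECT COUNT(*) FROM {child} AS c
        LEFT JOIN {parent} AS p ON c.{fk} = p.{pk}
        WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL
    """
    return conn.execute(sql).fetchone()[0]


# Deliberately flawed sample data: one duplicated order_id,
# one missing customer_key, and one key with no matching customer.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customers (customer_key INTEGER);
    CREATE TABLE fct_orders (order_id INTEGER, customer_key INTEGER);
    INSERT INTO dim_customers VALUES (1), (2);
    INSERT INTO fct_orders VALUES
        (1001, 1), (1002, 2), (1002, 2), (1003, 9), (1004, NULL);
    """
)
duplicates = count_duplicates(conn, "fct_orders", "order_id")
nulls = count_nulls(conn, "fct_orders", "customer_key")
orphans = count_orphans(
    conn, "fct_orders", "customer_key", "dim_customers", "customer_key"
)
```

The table and column names are interpolated directly into the SQL, which is acceptable only because they are fixed by the developer. They should never come from user input.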
Dashboard KPIs
- Total revenue and Average Order Value (AOV)
- Revenue by product category and time period
- Order volume trends and customer frequency
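As a worked example of the first and last KPIs, Average Order Value is total revenue divided by the number of orders, and customer frequency is the number of orders placed per customer. The sketch below computes both over a few made-up orders. The field names and helper functions are hypothetical, and the dashboard itself would compute these in the warehouse.

```python
from collections import Counter


def revenue_kpis(orders):
    """Return (total revenue, average order value) for a list of orders."""
    total = sum(o["amount"] for o in orders)
    # Guard against an empty period, where AOV is undefined.
    aov = total / len(orders) if orders else 0.0
    return total, aov


def orders_per_customer(orders):
    """Customer frequency: how many orders each customer placed."""
    return Counter(o["customer_id"] for o in orders)


# Made-up sample data for illustration only.
orders = [
    {"order_id": 1001, "customer_id": "c1", "amount": 60.0},
    {"order_id": 1002, "customer_id": "c2", "amount": 120.0},
    {"order_id": 1003, "customer_id": "c1", "amount": 30.0},
]
total_revenue, aov = revenue_kpis(orders)
frequency = orders_per_customer(orders)
```

For these three orders, total revenue is 210.0 and AOV is 70.0. Customer `c1` placed two orders and `c2` placed one.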
Key Technologies
- dbt — transformation, testing, and lineage documentation
- PostgreSQL — analytical warehouse
- Metabase — self-serve BI and dashboarding
- Docker Compose — fully reproducible local environment