Discover data engineering projects built with Python. Browse workflows, pipelines, and integrations from the community.
20 projects found
Production-ready data visualization for a NASA dataset
A self-healing data pipeline platform built on Airflow
A synthetic data generator that guarantees deterministic output
Real-Time Vehicle Data Processing Pipeline
Production-grade analytics pipeline
Predict, simulate, and debug Airflow schedules before they fail.
An Airflow UI plugin for monitoring DAG failures and SLA misses/delays.
Visualize how your Airflow DAG schedules are distributed across the day with an interactive heatmap
A lightweight, graph-based semantic layer. Define metrics once, generate SQL everywhere.
Drift Detective is a Python library for tracking schema evolution using versioned JSON snapshots
Enterprise-grade ETL pipeline transforming medical XML data into actionable business intelligence
Python React Experiment
CAP is an end-to-end cricket analytics platform built on Cricsheet ball-by-ball data
Reddit Data Engineering ETL Pipeline: Spark, Airflow, and MinIO in Docker with a Medallion Architecture
Fully AWS-native data pipelines for processing basketball (NBA) data.
Never miss a new top-starred repository
A friendly (and sometimes strict!) animated DAG auditor for Apache Airflow 3.1+
A powerful CLI tool that generates LLM-powered documentation for dbt models and columns
SCALABLE_YAHOO_API_ETL_PIPELINE_USING_AIRFLOW
Bulk manage Airflow DAG states effortlessly — pause or unpause in one action.