Infrastructure, frameworks, and tooling
Discover open-source data engineering projects in Data Platform from the community.
17 projects found
Production-ready data visualization for a NASA dataset
A self-healing data pipeline platform built on Airflow
A synthetic data generator that guarantees deterministic output
Production-grade analytics pipeline
An Airflow UI plugin for monitoring DAG failures and SLA misses/delays.
Visualize how your Airflow DAG schedules are distributed across the day with an interactive heatmap
A lightweight, graph-based semantic layer. Define metrics once, generate SQL everywhere.
Monitor, inspect, and trigger Airflow DAGs from the comfort of your terminal
Drift Detective is a Python library for tracking schema evolution using versioned JSON snapshots
AI-native e-commerce data platform you can run locally (Airflow + dbt + MCP)
Python React Experiment
An open-source accelerator for a ready-to-run, end-to-end analytics platform
Reddit Data Engineering ETL Pipeline: Spark, Airflow, and MinIO in Docker with a Medallion Architecture
Building a real-time data warehouse with state-of-the-art tools such as Apache Kafka
Real-time data pipeline with Kafka, Flink, Iceberg, Trino, and Superset.
Batch Data Pipeline with Airflow, DuckDB, Delta Lake, Trino, and Metabase. Observability and quality.
Bulk manage Airflow DAG states effortlessly — pause or unpause in one action.