Discover, Share & Showcase
the Best Data Engineering Projects
Explore curated Data Engineering projects from the community. Be recognized for your projects, vote for your favorites and share your own creations.
Last Week
Apr 5 - 11
Week of Mar 29
Mar 29 - Apr 4
Week of Mar 22
Mar 22 - 28
Week of Mar 15
Mar 15 - 21
Week of Mar 8
Mar 8 - 14
Week of Mar 1
Mar 1 - 7
Week of Feb 22
Feb 22 - 28
Week of Dec 28, 2025
Dec 28 - Jan 3
Week of Dec 21, 2025
Dec 21 - 27
14. Silism Commerce 360 - Local AI Native Lakehouse
AI-native e-commerce data platform you can run locally (Airflow + dbt + MCP)
15. Real-Time-Sales-Streaming-Pipeline
Modern Lakehouse Architecture with Kafka + Spark Structured Streaming + Delta Lake
16. AIRFLow Medical Data Pipeline
Enterprise-grade ETL pipeline transforming medical XML data into actionable business intelligence
17. Bluesky NBA Real-Time Sentiment Analysis
A real-time data streaming pipeline that captures live posts from Bluesky regarding the NBA, perform
Week of Dec 14, 2025
Dec 14 - 20
18. Airflow Python React Widgets
Python React Experiment
19. Yelp Batch ETL Pipeline
A batch ETL pipeline that processes raw Yelp business data to generate analytics and insights
20. Airflow and DBT Analytics Accelerator
An open-source accelerator for a ready-to-run, end-to-end analytics platform
21. Cricket Analytics Data Pipeline
CAP is an end-to-end cricket analytics platform built on Cricsheet ball-by-ball data
Week of Dec 7, 2025
Dec 7 - 13
Week of Nov 23, 2025
Nov 23 - 29
23. Smart Wardrobe Suggestion
LLM-based smart clothing suggestion
24. Reddit ETL Pipeline in Docker
Reddit data engineering ETL pipeline: Spark, Airflow, and MinIO in Docker, using a medallion architecture
25. Flink Sales Pipeline
Real-Time E-Commerce Sales Analytics Pipeline
26. Baskpipe
Fully AWS-native data pipelines for processing basketball (NBA) data.
27. Data Warehousing for Realtime Pipelines
Building a real-time data warehouse with state-of-the-art tools such as Apache Kafka, etc.
28. Github Stars Monitor
Never miss a new top-starred repository
29. Macro Agents Economic Data Platform
From FRED to Forecasts: A Modern Data Stack for Economic Intelligence
30. E2E Real-Time Data Pipeline
Real-time data pipeline with Kafka, Flink, Iceberg, Trino, and Superset.