Discover, Share & Showcase
the Best Data Engineering Projects

Explore curated Data Engineering projects from the community. Be recognized for your projects, vote for your favorites and share your own creations.

Week of Nov 23, 2025

Nov 23 - 29

61 votes

13. Smart Wardrobe Suggestion

LLM-Based Smart Clothing Suggestion

by Rahul Rajasekharan

14. Reddit ETL Pipeline in Docker

Reddit Data Engineering ETL Pipeline: Spark, Airflow, and MinIO in Docker with a Medallion Architecture

by Abdullah

15. Flink Sales Pipeline

Real-Time E-Commerce Sales Analytics Pipeline

by nuti.krish4

16. Baskpipe

Fully AWS-native data pipelines for processing basketball (NBA) data.

by dominik.zsajovic

17. Data Warehousing for Realtime Pipelines

Building a real-time data warehouse with state-of-the-art tools such as Apache Kafka.

by wahomewilberforce

18. Github Stars Monitor

Never miss a new top-starred repository

by maxime.lemaitre

19. Macro Agents Economic Data Platform

From FRED to Forecasts: A Modern Data Stack for Economic Intelligence

by a.noonan

20. E2E Real-Time Data Pipeline

Real-time data pipeline with Kafka, Flink, Iceberg, Trino, and Superset.

by abelst9

21. F1 Insights Real Time Replay

What if your dashboards were as real-time as Max Verstappen?

by hiteshkhk0105

22. Batch data pipeline

Batch Data Pipeline with Airflow, DuckDB, Delta Lake, Trino and Metabase. Observability and quality.

by abelst9

23. Daggie The Airflow DAG Quality Auditor

A friendly (and sometimes strict!) animated DAG auditor for Apache Airflow 3.1+

by Rahul Rajasekharan