Discover, Share & Showcase
the Best Data Engineering Projects

Explore curated Data Engineering projects from the community. Be recognized for your projects, vote for your favorites and share your own creations.

Week of Nov 23, 2025

Nov 23 - 29

60 votes

12. Smart Wardrobe Suggestion

LLM-Based Smart Clothing Suggestion

by Rahul Rajasekharan

13. Reddit ETL Pipeline in Docker

Reddit Data Engineering ETL Pipeline: Spark, Airflow, MinIO in Docker Medallion Architecture

+1
by Abdullah

14. Flink Sales Pipeline

Real-Time E-Commerce Sales Analytics Pipeline

+1
by nuti.krish4

15. Baskpipe

Fully AWS-native data pipelines for processing basketball (NBA) data.

+1
by dominik.zsajovic

16. Data Warehousing for Realtime Pipelines

Building a real-time data warehouse with state-of-the-art tools such as Apache Kafka.

by wahomewilberforce

17. Github Stars Monitor

Never miss a new top-starred repository

by maxime.lemaitre

18. Macro Agents Economic Data Platform

From FRED to Forecasts: A Modern Data Stack for Economic Intelligence

by a.noonan

19. E2E Real-Time Data Pipeline

Real-time data pipeline with Kafka, Flink, Iceberg, Trino, and Superset.

+1
by abelst9

20. F1 Insights Real Time Replay

What if your dashboards were as real-time as Max Verstappen?

+1
by hiteshkhk0105

21. Batch data pipeline

Batch data pipeline with Airflow, DuckDB, Delta Lake, Trino, and Metabase, with observability and data quality checks.

+1
by abelst9

22. Daggie: The Airflow DAG Quality Auditor

A friendly (and sometimes strict!) animated DAG auditor for Apache Airflow 3.1+

by Rahul Rajasekharan