Data Modeling for the Lakehouse: What Changes

Traditional data modeling assumed you controlled the database. You defined schemas up front, enforced foreign keys at write time, and optimized with indexes. The lakehouse changes every one of those assumptions. Data lives in open file formats on object storage. Schemas evolve without rewriting data. Queries run through engines that may not enforce relational constraints. The modeling discipline is the same, but the mechanics are different.

What's Different About a Lakehouse

A lakehouse stores data as files, typically Parquet, on object storage like S3 or Azure Blob. An open table format like Apache Iceberg adds structure: schema definitions, partition metadata, snapshot history, and transactional guarantees.

This architecture gives you more flexibility than a traditional RDBMS, but also more responsibility. There are no foreign key constraints enforced at write time. No triggers. No stored procedures. Referential integrity is your problem to solve in pipelines and views, not somethi
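
As a rough illustration of checking referential integrity in a pipeline rather than in the storage layer, the sketch below runs an anti-join in plain Python: it returns child rows whose key has no match in the parent table. The table names (`orders`, `customers`), the `customer_id` column, and the `find_orphans` helper are all hypothetical and only stand in for rows read from two lakehouse tables. A real pipeline would more likely express the same check in the query engine.

```python
from typing import Any, Iterable, Mapping


def find_orphans(
    child_rows: Iterable[Mapping[str, Any]],
    parent_rows: Iterable[Mapping[str, Any]],
    child_key: str,
    parent_key: str,
) -> list[Mapping[str, Any]]:
    """Return child rows whose key has no match among the parent rows.

    This is an anti-join. In this sketch a missing or None key in a child
    row is also reported as an orphan, which is stricter than a nullable
    foreign key in a relational database.
    """
    # Collect parent keys once so each child lookup is a set membership test.
    parent_keys = {row[parent_key] for row in parent_rows}
    return [row for row in child_rows if row.get(child_key) not in parent_keys]


# Hypothetical sample data standing in for rows read from two tables.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 3},  # references a customer that does not exist
]

orphans = find_orphans(orders, customers, "customer_id", "customer_id")
```

A pipeline step could fail the load, quarantine the orphaned rows, or only log them. Which one is appropriate is a design decision the check itself does not make.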
