The Need for New Low-Level API for Lakehouse-Centric Compute Engines
In this post, I will argue that today’s query engines, such as Apache Spark, do not effectively interact with lake houses due to limitations in handling object storage semantics, lack of integration with catalog systems like Iceberg for governance and lineage, and inadequate support for lakehouse-specific features. I believe