Tag: Apache Arrow
All the articles with the tag "Apache Arrow".
What is Apache Parquet? Columns, Encoding, and Performance
Published: at 12:00 PM
If you ask a data analyst to calculate the average transaction amount for the month of July using a massive CSV file, the compute engine must read eve...
What is Apache Polaris? Unifying the Iceberg Ecosystem
Published: at 12:00 PM
Treating thousands of Parquet files as a unified database table requires a brain. Apache Iceberg provides the metadata structure to do this, but the Iceberg specification alone does not manage security roles, handle network requests, or broker credentials. You need an open catalog service to orchestrate those root metadata pointers. Apache Polaris serves as that open-source, vendor-neutral brain. This comprehensive guide explains the catalog fragmentation war, open governance under the Apache Software Foundation, role-based access control hierarchies, credential vending vs IAM sprawl, and how Polaris powers Dremio's agentic query acceleration.
Apache Software Foundation: History, Purpose, and Process
Published: at 12:00 PM
If you build a modern data lakehouse, you inevitably stack Apache Iceberg, Apache Parquet, and Apache Arrow. These projects dictate how you store, que...