Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans

Apache Iceberg Masterclass - Table of Contents

  1. What Are Table Formats and Why Were They Needed?
  2. The Metadata Structure of Modern Table Formats
  3. Performance and Apache Iceberg’s Metadata
  4. Partition Evolution: Change Your Partitioning Without Rewriting Data
  5. Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans
  6. Writing to an Apache Iceberg Table: How Commits and ACID Actually Work
  7. What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg
  8. When Catalogs Are Embedded in Storage
  9. How Data Lake Table Storage Degrades Over Time
  10. Maintaining Apache Iceberg Tables: Compaction, Expiry, and Cleanup
  11. Apache Iceberg Metadata Tables: Querying the Internals
  12. Using Apache Iceberg with Python and MPP Query Engines
  13. Approaches to Streaming Data into Apache Iceberg Tables
  14. Hands-On with Apache Iceberg Using Dremio Cloud
  15. Migrating to Apache Iceberg: Strategies for Every Source System

This is Part 5 of a 15-part Apache Iceberg Masterclass. Part 4 covered partition evolution. This article covers hidden partitioning, the feature that ensures users never need to know how their data is physically organized.

The most expensive mistake in data lake querying is the accidental full table scan: a query that reads every file because the user did not correctly reference the partition columns. In Hive, this happens constantly. In Iceberg, it is structurally impossible because users never reference partition columns at all.

The Accidental Full Scan Problem

[Figure: Exposed partitioning in Hive versus hidden partitioning in Iceberg, showing the same pruning with a different user experience]

In Hive, a table partitioned by year, month, and day requires queries to filter on those exact columns:

-- Hive: This prunes correctly
SELECT * FROM orders WHERE year = 2024 AND month = 3 AND day = 15

-- Hive: This scans EVERYTHING (no pruning)
SELECT * FROM orders WHERE order_date = '2024-03-15'

The second query reads every partition because Hive does not know that order_date maps to the year, month, and day partition columns. There is no error and no warning. The query simply runs 100x slower than it should.

This happens because Hive partitioning is “exposed.” The physical partition columns (year, month, day) are separate from the source column (order_date). Users must understand this mapping and construct their filters accordingly.

How Iceberg Hides Partitioning

Iceberg flips this model. Users filter on the source column (order_date), and the engine automatically maps the filter to the partition values using transform functions.

-- Iceberg: This prunes correctly. Always.
SELECT * FROM orders WHERE order_date = '2024-03-15'

The table’s partition spec declares: PARTITIONED BY (day(order_date)). When the engine processes this query, it:

  1. Reads the partition spec from the table metadata
  2. Applies the day() transform to the filter value: day('2024-03-15') = 2024-03-15
  3. Checks manifest entries for files with matching partition values
  4. Skips every file whose partition value is not 2024-03-15

The user writes natural SQL against the source columns. The engine handles the physical-to-logical mapping. This is why it is called “hidden” partitioning: the partition structure is invisible to the user.
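The mapping the engine performs can be sketched in a few lines. This is an illustrative model only, not Iceberg's actual API: the function and field names (day_transform, files, partition) are hypothetical stand-ins for the transform and the per-file partition values recorded in manifests.

```python
from datetime import date

# Hypothetical sketch of hidden-partition pruning under a day() transform.
# Names here are illustrative, not Iceberg's real API.

def day_transform(value: str) -> date:
    """Map a timestamp/date literal to its day partition value."""
    return date.fromisoformat(value[:10])

# Each data file's manifest entry records the file's partition value.
files = [
    {"path": "f1.parquet", "partition": date(2024, 3, 14)},
    {"path": "f2.parquet", "partition": date(2024, 3, 15)},
    {"path": "f3.parquet", "partition": date(2024, 3, 16)},
]

# User filter: order_date = '2024-03-15'. The engine applies the
# transform to the literal, then keeps only the matching files.
target = day_transform("2024-03-15")
scan_plan = [f["path"] for f in files if f["partition"] == target]
```

The key point the sketch captures: the transform is applied to the filter literal, never by the user, so the query text stays in terms of order_date.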

The Six Transform Functions

[Figure: Iceberg's six partition transform functions, showing how each maps source values to partition values]

Iceberg defines six partition transforms that map source column values to partition values:

Temporal Transforms

| Transform | Input               | Output        | Use Case                               |
|-----------|---------------------|---------------|----------------------------------------|
| year(ts)  | 2024-03-15 10:30:00 | 2024          | Low-volume tables, yearly reporting    |
| month(ts) | 2024-03-15 10:30:00 | 2024-03       | Medium-volume tables, monthly queries  |
| day(ts)   | 2024-03-15 10:30:00 | 2024-03-15    | High-volume tables, daily queries      |
| hour(ts)  | 2024-03-15 10:30:00 | 2024-03-15-10 | Very high-volume streaming data        |

The temporal transforms are hierarchical. If a table is partitioned by day(ts) and a user filters WHERE ts >= '2024-03-01' AND ts < '2024-04-01', the engine recognizes this as a range of days and prunes to only the 31 matching partitions. Engines like Dremio handle this mapping automatically for equality, range, and IN-list predicates.
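That range expansion is simple to model. The sketch below is illustrative, not engine code: it shows how a half-open timestamp range maps onto the set of day partition values an engine could prune to.

```python
from datetime import date, timedelta

# Illustrative only: expand a half-open timestamp range predicate
# into the set of day() partition values it covers.

def days_in_range(start: str, end_exclusive: str) -> list[date]:
    lo = date.fromisoformat(start)
    hi = date.fromisoformat(end_exclusive)
    return [lo + timedelta(days=i) for i in range((hi - lo).days)]

# ts >= '2024-03-01' AND ts < '2024-04-01' covers 31 day partitions
march = days_in_range("2024-03-01", "2024-04-01")
```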

Value Transforms

| Transform        | Input             | Output | Use Case                                      |
|------------------|-------------------|--------|-----------------------------------------------|
| truncate(N, col) | 'New York' (N=3)  | 'New'  | Grouping strings by prefix                    |
| bucket(N, col)   | 12345 (N=16)      | 7      | Even distribution of high-cardinality columns |

truncate(N, col) takes the first N characters of a string (or truncates a number to a width). This is useful when you want to group data by a string prefix without creating one partition per unique value.

bucket(N, col) applies a hash function and mod N to produce a bucket number from 0 to N-1. This distributes data evenly across a fixed number of buckets, regardless of the column’s value distribution. It is the go-to transform for high-cardinality columns like user_id or order_id where identity partitioning would create millions of tiny partitions.
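The two value transforms can be sketched as follows. One caveat: Iceberg specifies a 32-bit Murmur3 hash for bucket(); the CRC32 below is a stand-in so the sketch stays dependency-free, so the bucket numbers it produces will differ from Iceberg's.

```python
import zlib

# Sketch of Iceberg's value transforms. CRC32 is a stand-in for the
# 32-bit Murmur3 hash Iceberg actually specifies for bucket().

def truncate(width: int, value):
    if isinstance(value, str):
        return value[:width]            # first N characters
    return value - (value % width)      # integers truncate to a multiple of N

def bucket(n: int, value) -> int:
    h = zlib.crc32(str(value).encode("utf-8"))
    return (h & 0x7FFFFFFF) % n         # bucket number in [0, n-1]
```

Note how bucket() guarantees a fixed partition count of N no matter how many distinct user_id values exist, which is exactly why it suits high-cardinality columns.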

The Identity Transform

The identity transform (identity(col)) uses the raw column value as the partition value. This is equivalent to Hive-style partitioning, but the column is still “hidden” because the engine handles the mapping. It is useful for low-cardinality columns like region or status where each unique value should be its own partition.

How Pruning Works Under the Hood

[Figure: Step-by-step flow showing how the engine maps a user query through the partition spec to prune files]

The pruning process works in three phases:

Phase 1: Predicate translation. The engine examines each WHERE clause predicate and checks if the filtered column is part of the partition spec. If order_date is the source column for day(order_date), the engine can translate order_date = '2024-03-15' into a partition filter.

Phase 2: Manifest list evaluation. The manifest list stores partition value ranges per manifest. The engine checks if each manifest’s range includes the target partition value. Manifests whose range does not overlap are skipped entirely.

Phase 3: Manifest entry evaluation. For each surviving manifest, the engine checks individual file entries. Only files whose partition value matches 2024-03-15 are included in the scan plan.

This is the same pruning cascade described in Part 3, but here the partition values are derived automatically from the user's filter on a source column.
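The manifest-level and file-level phases can be modeled as a toy two-level cascade. The field names (lo, hi, entries) are hypothetical simplifications of what manifest lists and manifest entries actually store.

```python
from datetime import date

# Toy two-level pruning cascade with hypothetical field names.
# Manifests carry partition value ranges; entries carry exact values.

manifests = [
    {"name": "m1", "lo": date(2024, 3, 1),  "hi": date(2024, 3, 10),
     "entries": [{"path": "a.parquet", "partition": date(2024, 3, 5)}]},
    {"name": "m2", "lo": date(2024, 3, 11), "hi": date(2024, 3, 20),
     "entries": [{"path": "b.parquet", "partition": date(2024, 3, 15)},
                 {"path": "c.parquet", "partition": date(2024, 3, 16)}]},
]

def plan_scan(target: date) -> list[str]:
    plan = []
    for m in manifests:
        if not (m["lo"] <= target <= m["hi"]):
            continue                       # Phase 2: skip whole manifest
        for e in m["entries"]:
            if e["partition"] == target:   # Phase 3: keep matching files
                plan.append(e["path"])
    return plan
```

For a filter that resolves to 2024-03-15, manifest m1 is skipped without ever opening its entries, and only b.parquet survives into the scan plan.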

Choosing the Right Transform

The choice of partition transform depends on data volume and query patterns:

| Scenario                     | Recommended Transform | Rationale                                            |
|------------------------------|-----------------------|------------------------------------------------------|
| 10 GB/day of event data      | day(event_time)       | Each day is one partition (~10 GB), well-sized files |
| 1 TB/day of event data       | hour(event_time)      | Each hour is ~42 GB, prevents oversized partitions   |
| 500 MB/month of reports      | month(report_date)    | Monthly partitions keep file counts manageable       |
| User-level data, 10M users   | bucket(64, user_id)   | Even distribution, avoids millions of tiny partitions|
| Region-based data, 5 regions | identity(region)      | Only 5 partitions, each meaningfully distinct        |

The goal is to create partitions that are large enough to hold optimally sized files (128-512 MB each) but small enough that partition pruning eliminates most files for typical queries.

Over-partitioning (too many small partitions) creates the small file problem: thousands of tiny files that bloat metadata and slow query planning. Under-partitioning (too few large partitions) reduces pruning effectiveness because each partition contains too much data.
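The sizing guidance above is back-of-envelope arithmetic. The helper below makes it explicit, assuming a 256 MB target file size (the midpoint of the 128-512 MB range; the function name is an invention for this sketch).

```python
# Back-of-envelope sizing check, assuming a 256 MB target file size.

def files_per_partition(partition_gb: float, target_file_mb: int = 256) -> float:
    """Roughly how many target-sized files one partition would hold."""
    return partition_gb * 1024 / target_file_mb

# 10 GB/day, partitioned by day  -> ~40 files per partition (healthy)
daily = files_per_partition(10)

# 1 TB/day, partitioned by hour  -> ~42.7 GB/hour -> ~170 files per partition
hourly = files_per_partition(1024 / 24)
```

Running the same arithmetic with month(event_time) on the 1 TB/day workload would give ~30 TB partitions, which is why the table above pushes high-volume data toward finer temporal transforms.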

Combining Transforms

Iceberg supports multi-column partition specs:

CREATE TABLE events (
  event_id BIGINT,
  event_time TIMESTAMP,
  user_id BIGINT,
  event_type STRING
) PARTITIONED BY (day(event_time), bucket(32, user_id))

This creates a two-dimensional partition space: each combination of day and user bucket is a separate partition. Queries filtering on event_time get day-level pruning. Queries filtering on user_id get bucket-level pruning. Queries filtering on both get pruning from both dimensions.
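The partition tuple each row lands in can be sketched directly. As before, CRC32 stands in for the Murmur3 hash Iceberg specifies, so the bucket numbers are illustrative only.

```python
import zlib
from datetime import date

# Sketch of the two-dimensional partition tuple produced by
# PARTITIONED BY (day(event_time), bucket(32, user_id)).
# CRC32 stands in for Iceberg's specified Murmur3 hash.

def partition_tuple(event_time: str, user_id: int, n: int = 32):
    day_value = date.fromisoformat(event_time[:10])
    bucket_value = (zlib.crc32(str(user_id).encode()) & 0x7FFFFFFF) % n
    return (day_value, bucket_value)

# Rows with the same day and the same user bucket share one partition;
# a filter on either source column prunes along that dimension alone.
t = partition_tuple("2024-03-15 10:30:00", 12345)
```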

Dremio supports all Iceberg transform functions and automatically applies pruning for any combination of partition columns in the query’s WHERE clause.

Why This Matters for Teams

Hidden partitioning changes the operational model for data teams:

Data engineers define the partition strategy once in the table’s partition spec. They can change it later through partition evolution without breaking anything.

Analysts and data scientists write natural SQL against the business columns they understand. They never need to know whether the table is partitioned by day, month, or bucket. Their queries are automatically optimized.

BI tools and dashboards connect to Iceberg tables and issue standard SQL. The tools do not need to understand Iceberg’s partitioning; the engine handles the optimization. This is why hidden partitioning is essential for self-service analytics platforms like Dremio.

The net result: no accidental full table scans, no partition-aware query patterns required from users, and the ability to change the physical layout without impacting any downstream consumer. Part 6 covers what happens when data is written to an Iceberg table.
