04 May 2026
Exploring the Variant type in Parquet
Let's get shredding
I’ve had observability on my mind for quite some time now. 2 years ago I was trying to expand my metrics setup with structured logs emitted by some services. The ensuing exploration only yielded the obvious: Storing and efficiently querying arbitrary data is hard. DuckDB was all the rage back then, so naturally I had to see how it handles analytics workloads. Finally, a year later, I checked out ClickHouse and found it amazingly fast for my purposes.
Ever since, I’ve run a ClickHouse instance to store arbitrary JSON payloads fed by GPS location data, sensor readings and tracing data. But a couple of days ago that nagging feeling came back: What if this can be simplified? ClickHouse is by far the heaviest service I operate on my server. It randomly fills up my disk with logging tables, too. Look, ClickHouse is an engineering marvel, but it’s not my marvel. It’s someone else’s marvel. And I have a need to create my own. Thing, not marvel, I mean. So here we are.
The hard part is still how to store arbitrary JSON payloads in such a way that aggregations on
fields are performant. My current solution heavily relies on the excellent Map type in ClickHouse.
To replace it, some storage engine is going to have to do some heavy lifting. My hope is that DuckDB
can take that role if we massage the data just a bit. Since I last used DuckDB, quite a bit has
changed. Particularly interesting for me: Parquet got a Variant type and DuckDB supports it.
But before we get into the weeds let me set the scene. Let’s say I have a service that sends structured logs (or wide events) to my observability system. Today they might look like this:
{
"timestamp": "2026-05-05T08:00:00Z",
"service": "marending-dev",
"severity": "error",
"message": "User Bobby Tables doesn't exist"
}
But then I want to track more information.
{
"timestamp": "2026-05-05T08:00:00Z",
"service": "marending-dev",
"severity": "error",
"message": "User Bobby Tables doesn't exist",
"user_id": "null",
"ip": "85.0.1.42",
"path": "/user/login"
}
You get the picture. The question is, how can we store these JSON blobs such that we can then
efficiently compute, say, error rate grouped by user_id? We can’t have a static schema as we might
want to add properties to this event at any time. We can chuck this as a JSON string into any
database, file format or what have you. But then any query that attempts to read from properties of
that JSON string is going to incur the significant cost of extracting the appropriate fields for
every row of interest. One way to get reasonable performance here is to ensure severity and user_id
are each contiguously laid out so that they can be efficiently aggregated.
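To make the goal concrete, the kind of query I want to be cheap looks roughly like this; the column layout is purely illustrative:
-- Hypothetical target query: error rate per user over the last hour.
-- For this to be fast, severity and user_id must be readable without
-- parsing the full JSON blob for every row.
SELECT
    user_id,
    count(*) FILTER (WHERE severity = 'error') / count(*)::DOUBLE AS error_rate
FROM events
WHERE "timestamp" > now() - INTERVAL 1 HOUR
GROUP BY user_id;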
Parquet and Variant
That brings us to Apache Parquet. Parquet is a column-oriented file format. While DuckDB has its own file format for storing data, one of its strengths is exactly supporting a wide variety of other formats, including Parquet. So you can write and query Parquet files natively with SQL within DuckDB.
Recently Parquet added the Variant type, a binary encoding of primitives, arrays, or objects of arbitrary types. It allows representing structured data in such a way that it can be queried more efficiently than if the same data was serialized and stored as a string type. But the real kicker is “Variant shredding”. It’s the process of extracting some of the properties within the Variant into their own columns. This should then give us maximum performance when aggregating on those columns. And DuckDB apparently supports this!
Once I learned about this, a path towards my dream observability system emerged. Step 1: Write a service that accepts arbitrary JSON payloads, buffers them in memory and, every 10 seconds or so, writes them to Parquet files with the payload stored as a Variant. Step 2: Every hour, read all the small Parquet files and shred them into a larger hourly file. Step 3: Possibly aggregate the hourly files into even broader files, but I’ll see when I get to it.
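Step 2, in particular, looks like it could be a single DuckDB statement. A rough sketch, with made-up paths and assuming the autoshredder does the right thing:
-- Hypothetical hourly compaction: gather the small 10-second files and
-- rewrite them as one larger Parquet file (paths are made up).
COPY (SELECT * FROM 'events/raw/2026-05-05T08/*.parquet')
    TO 'events/hourly/2026-05-05T08.parquet' (FORMAT parquet);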
Now at query time, I will have at most an hour’s worth of data in inefficient, unshredded form. That should be easy enough to deal with. Data older than that will be in highly efficient form. And the best part? DuckDB should be able to query across many Parquet files using glob syntax, even when some have concrete, shredded columns and some don’t. But we’ll see this for ourselves.
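The kind of query I have in mind would then be something like the sketch below; the paths are made up, and whether grouping on a variant field needs the explicit cast is one of the things to verify:
-- Sketch of a cross-file aggregation; '**' globs recursively, so this should
-- pick up both the small unshredded files and the compacted hourly ones.
SELECT data.severity::VARCHAR AS severity, count(*) AS n
FROM 'events/**/*.parquet'
GROUP BY severity;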
Exploration
Let’s start with the example given in DuckDB’s documentation.
CREATE TABLE events (id INTEGER, data VARIANT);
INSERT INTO events VALUES
(1, 42::VARIANT),
(2, 'hello world'::VARIANT),
(3, [1, 2, 3]::VARIANT),
(4, {'name': 'Alice', 'age': 30}::VARIANT);
Querying the table presents the expected result.
SELECT * FROM events;
┌───────┬────────────────────────────┐
│ id │ data │
│ int32 │ variant │
├───────┼────────────────────────────┤
│ 1 │ 42 │
│ 2 │ hello world │
│ 3 │ [1, 2, 3] │
│ 4 │ {'name': Alice, 'age': 30} │
└───────┴────────────────────────────┘
SELECT id, data, variant_typeof(data) AS vtype FROM events;
┌───────┬────────────────────────────┬───────────────────┐
│ id │ data │ vtype │
│ int32 │ variant │ varchar │
├───────┼────────────────────────────┼───────────────────┤
│ 1 │ 42 │ INT32 │
│ 2 │ hello world │ VARCHAR │
│ 3 │ [1, 2, 3] │ ARRAY(3) │
│ 4 │ {'name': Alice, 'age': 30} │ OBJECT(name, age) │
└───────┴────────────────────────────┴───────────────────┘
We can now also query nested fields in a Variant.
SELECT data.name FROM events;
┌─────────┐
│ name │
│ variant │
├─────────┤
│ NULL │
│ NULL │
│ NULL │
│ Alice │
└─────────┘
Writing this data as-is to a Parquet file is now as simple as
COPY events TO 'events.parquet' (FORMAT parquet);. And indeed, having DuckDB describe the Parquet
file’s shape yields the expected result.
DESCRIBE SELECT * FROM 'events.parquet';
┌────────────────┐
│ events.parquet │
│ │
│ id integer │
│ data variant │
└────────────────┘
But hold on. According to the docs, writing to a Parquet file should automatically shred the variant type. Shouldn’t I see the individual columns then? Turns out shredding is only a storage detail, not a logical schema change. The type stays as is. To see how the data is actually stored, we have to inspect the file’s metadata.
SELECT name, type, converted_type FROM parquet_schema('events.parquet');
┌───────────────┬────────────┬────────────────┐
│ name │ type │ converted_type │
│ varchar │ varchar │ varchar │
├───────────────┼────────────┼────────────────┤
│ duckdb_schema │ NULL │ NULL │
│ id │ INT32 │ INT_32 │
│ data │ NULL │ NULL │
│ metadata │ BYTE_ARRAY │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ INT32 │ INT_32 │
└───────────────┴────────────┴────────────────┘
Well, the way this is supposed to be read is as follows: id is of type INT32, while the data variant
itself doesn’t really have a type but is made up of three parts: metadata, value and
typed_value. Of interest are value and typed_value. Each row in the data logical column is
either going to have the (fallback) byte-array value representation set, or the shredded
typed_value column. In this case, DuckDB’s autoshredder apparently decided that row 1, with the
integer data variant, deserves to have its representation shredded, but not the other rows.
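A way to double-check this without leaving DuckDB should be parquet_metadata(), which, if I read its output columns correctly, exposes per-column null counts:
-- Per-column null counts straight from DuckDB: a row that was shredded shows
-- up as NULL in data.value, and an unshredded row as NULL in typed_value.
SELECT path_in_schema, num_values, stats_null_count
FROM parquet_metadata('events.parquet');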
If instead we instruct DuckDB to shred data to a struct like so
COPY events TO 'events.parquet' (
FORMAT parquet,
SHREDDING {'data': 'STRUCT(name VARCHAR, age INTEGER)'}
);
we get this
SELECT name, type, converted_type FROM parquet_schema('events.parquet');
┌───────────────┬────────────┬────────────────┐
│ name │ type │ converted_type │
│ varchar │ varchar │ varchar │
├───────────────┼────────────┼────────────────┤
│ duckdb_schema │ NULL │ NULL │
│ id │ INT32 │ INT_32 │
│ data │ NULL │ NULL │
│ metadata │ BYTE_ARRAY │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ NULL │ NULL │
│ name │ NULL │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ BYTE_ARRAY │ UTF8 │
│ age │ NULL │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ INT32 │ INT_32 │
└───────────────┴────────────┴────────────────┘
Now we see data with its metadata, value and typed_value properties, and also value and
typed_value logical columns for both name and age. For these, the shredded types are UTF8 and
INT32 respectively.
To get an intuitive understanding, it is interesting to look at the output of
parquet-tools meta events.parquet (from this tool):
{
"NumRowGroups": 1,
"RowGroups": [
{
"NumRows": 4,
"TotalByteSize": 725,
"Columns": [
{
"PathInSchema": ["id"],
"Type": "INT32",
"ConvertedType": "convertedtype=INT_32",
"Encodings": ["PLAIN"],
"CompressedSize": 41,
"UncompressedSize": 39,
"NumValues": 4,
"NullCount": 0,
"MaxValue": 4,
"MinValue": 1,
"CompressionCodec": "SNAPPY"
},
{
"PathInSchema": ["data", "metadata"],
"Type": "BYTE_ARRAY",
"Encodings": ["PLAIN"],
"CompressedSize": 52,
"UncompressedSize": 60,
"NumValues": 4,
"NullCount": 0,
"CompressionCodec": "SNAPPY"
},
{
"PathInSchema": ["data", "value"],
"Type": "BYTE_ARRAY",
"Encodings": ["PLAIN"],
"CompressedSize": 81,
"UncompressedSize": 137,
"NumValues": 4,
"NullCount": 1,
"CompressionCodec": "SNAPPY"
},
{
"PathInSchema": ["data", "typed_value", "name", "value"],
"Type": "BYTE_ARRAY",
"Encodings": ["PLAIN"],
"CompressedSize": 34,
"UncompressedSize": 119,
"NumValues": 4,
"NullCount": 4,
"CompressionCodec": "SNAPPY"
},
{
"PathInSchema": ["data", "typed_value", "name", "typed_value"],
"Type": "BYTE_ARRAY",
"ConvertedType": "convertedtype=UTF8",
"Encodings": ["PLAIN"],
"CompressedSize": 44,
"UncompressedSize": 128,
"NumValues": 4,
"NullCount": 3,
"CompressionCodec": "SNAPPY"
},
{
"PathInSchema": ["data", "typed_value", "age", "value"],
"Type": "BYTE_ARRAY",
"Encodings": ["PLAIN"],
"CompressedSize": 34,
"UncompressedSize": 119,
"NumValues": 4,
"NullCount": 4,
"CompressionCodec": "SNAPPY"
},
{
"PathInSchema": ["data", "typed_value", "age", "typed_value"],
"Type": "INT32",
"ConvertedType": "convertedtype=INT_32",
"Encodings": ["PLAIN"],
"CompressedSize": 39,
"UncompressedSize": 123,
"NumValues": 4,
"NullCount": 3,
"CompressionCodec": "SNAPPY"
}
]
}
]
}
Let’s take this line, for example: ["data", "typed_value", "name", "typed_value"]. A row in our
dataset is going to populate this logical column when the name property in the variant is a UTF8
string; otherwise it will fall back to the BYTE_ARRAY data.value. And indeed we see that 3 out of 4
values in this column are NULL, as row 4 is the only one that can populate this typed_value.
Conversely, the ["data", "value"] entry only has a single NULL value: row 4 is the only one that
does not populate the fallback column but the typed values instead.
If we look at ["data", "typed_value", "name", "value"], we see that it contains not a single
non-NULL value. I suppose this would be populated if we had a row with an object that contains the
name property, but with a type other than string.
I’ll just leave this blog post on the official Parquet website here for further reading. It has some nice visualizations that further explain how this works.
Autoshredding
So far so good. But we don’t want to have to specify how the values should be shredded. The whole point would be for DuckDB to figure it out from the data, but somehow in our first attempt it just chose to “materialize” the integer and not our object.
Well, it looks like DuckDB decides what to shred based on how often a specific type occurs in the table. If I add two more rows that have the same shape as the object in row 4
INSERT INTO events VALUES (5, {'name': 'Bob', 'age': 27}::VARIANT);
INSERT INTO events VALUES (6, {'name': 'Flo', 'age': 31}::VARIANT);
COPY events TO 'events-more-structs.parquet' (FORMAT parquet);
I get our familiar shredding, this time without having to specify the type.
SELECT name, type, converted_type FROM parquet_schema('events-more-structs.parquet');
┌───────────────┬────────────┬────────────────┐
│ name │ type │ converted_type │
│ varchar │ varchar │ varchar │
├───────────────┼────────────┼────────────────┤
│ duckdb_schema │ NULL │ NULL │
│ id │ INT32 │ INT_32 │
│ data │ NULL │ NULL │
│ metadata │ BYTE_ARRAY │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ NULL │ NULL │
│ age │ NULL │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ INT32 │ INT_32 │
│ name │ NULL │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ BYTE_ARRAY │ UTF8 │
└───────────────┴────────────┴────────────────┘
If I now add a bunch more rows that are plain strings (sketched below), I can get the autoshredder
to shred only the UTF8 type. Initially I would have expected that it should be possible to have
both shredded age and name values as well as a fallback not to an untyped BYTE_ARRAY but to a UTF8
string. But thinking about this, I guess the typed_value for data is now an object of name and age
(although that’s not readily visible in the above representation), so it can’t be a string at the
same time.
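For reference, that string experiment boils down to a handful of inserts, something like this (ids and file name are made up):
-- Enough plain string rows that strings become the most common variant type,
-- so the autoshredder picks UTF8 instead of the object.
INSERT INTO events VALUES
    (7, 'just a string'::VARIANT),
    (8, 'another string'::VARIANT),
    (9, 'strings everywhere'::VARIANT);
COPY events TO 'events-strings.parquet' (FORMAT parquet);
SELECT name, type, converted_type FROM parquet_schema('events-strings.parquet');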
It should be possible, however, if the strings were in a new property inside an object. Let me give this a shot.
INSERT INTO events VALUES (10, {'message': 'Hello world'}::VARIANT);
INSERT INTO events VALUES (11, {'message': 'message'}::VARIANT);
INSERT INTO events VALUES (12, {'message': '42'}::VARIANT);
COPY events TO 'events-message.parquet' (FORMAT parquet);
SELECT name, type, converted_type FROM parquet_schema('events-message.parquet');
┌───────────────┬────────────┬────────────────┐
│ name │ type │ converted_type │
│ varchar │ varchar │ varchar │
├───────────────┼────────────┼────────────────┤
│ duckdb_schema │ NULL │ NULL │
│ id │ INT32 │ INT_32 │
│ data │ NULL │ NULL │
│ metadata │ BYTE_ARRAY │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ NULL │ NULL │
│ message │ NULL │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ BYTE_ARRAY │ UTF8 │
│ age │ NULL │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ INT32 │ INT_32 │
│ name │ NULL │ NULL │
│ value │ BYTE_ARRAY │ NULL │
│ typed_value │ BYTE_ARRAY │ UTF8 │
└───────────────┴────────────┴────────────────┘
Indeed we can see that it now materialized age, name and message. Neat!
Real world data
With my mental model starting to get less foggy, I think it’s time to start digging into some real world data from my existing observability setup.