Re: ZStandard (with dictionaries) compression support for TOAST compression

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Nikhil Kumar Veldanda <veldanda(dot)nikhilkumar17(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: ZStandard (with dictionaries) compression support for TOAST compression
Date: 2025-03-06 12:52:44
Message-ID: 40265da8-b3be-4407-ba8a-4f48f2a4453f@postgrespro.ru
Lists: pgsql-hackers

06.03.2025 08:32, Nikhil Kumar Veldanda wrote:
> Hi all,
>
> The ZStandard compression algorithm [1][2], though not currently used for
> TOAST compression in PostgreSQL, offers significantly better compression
> ratios than lz4/pglz in both dictionary-based and non-dictionary modes.
> Attached for review is my patch to add ZStandard compression to Postgres.
> In tests, this patch used with a pre-trained dictionary achieved up to four
> times the compression ratio of LZ4, while ZStandard without a dictionary
> outperformed LZ4/pglz by about a factor of two.
>
> Notably, this is the first compression algorithm for Postgres that can make
> use of a dictionary to provide higher levels of compression, but
> dictionaries have to be generated and maintained, so I’ve had to break new
> ground in that regard. Using the dictionary support requires training and
> storing a dictionary for a given variable-length column. To do this, a SQL
> function is called on the column; it samples the column’s data and feeds
> the samples into the ZStandard training API, which returns a dictionary.
> In the example below, the column is of JSONB type. The SQL function takes
> the table name and the attribute number as inputs and returns true if the
> training succeeds, false otherwise.
>
> ```
> test=# select build_zstd_dict_for_attribute('"public"."zstd"', 1);
>  build_zstd_dict_for_attribute
> -------------------------------
>  t
> (1 row)
> ```
>
> The sampling logic and the data fed to the ZStandard training API can vary
> by data type. The patch provides a way to write type-specific training
> functions and includes a default for JSONB, TEXT and BYTEA. A new CREATE
> TYPE option called 'build_zstd_dict' takes a function name as input, so
> anyone can write their own type-specific training function by handling the
> sampling logic and returning the information needed by the ZStandard
> training API in "ZstdTrainingData" format (a sketch of such a function
> follows the struct below).
>
> ```
> typedef struct ZstdTrainingData
> {
>     char    *sample_buffer;    /* Pointer to the raw sample buffer */
>     size_t  *sample_sizes;     /* Array of sample sizes */
>     int      nitems;           /* Number of sample sizes */
> } ZstdTrainingData;
> ```
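>
> As an illustration, a custom training function might fill this struct
> roughly as follows. This is only a sketch, not code from the patch: the
> function name, its arguments, and the StringInfo-based sample collection
> are hypothetical; only the ZstdTrainingData fields come from the
> description above.
>
> ```
> /* Sketch only; assumes postgres.h, fmgr.h and lib/stringinfo.h. */
> static ZstdTrainingData *
> my_type_build_zstd_samples(Datum *values, int nvalues)
> {
>     ZstdTrainingData *td = palloc0(sizeof(ZstdTrainingData));
>     StringInfoData buf;
>
>     initStringInfo(&buf);
>     td->sample_sizes = palloc(sizeof(size_t) * nvalues);
>     td->nitems = nvalues;
>
>     /* Detoast each value, append it to one contiguous sample buffer,
>      * and remember the individual sample sizes. */
>     for (int i = 0; i < nvalues; i++)
>     {
>         struct varlena *v = pg_detoast_datum((struct varlena *) DatumGetPointer(values[i]));
>
>         td->sample_sizes[i] = VARSIZE_ANY_EXHDR(v);
>         appendBinaryStringInfo(&buf, VARDATA_ANY(v), VARSIZE_ANY_EXHDR(v));
>     }
>     td->sample_buffer = buf.data;
>     return td;
> }
> ```
>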
> This information is fed into the ZStandard training API, which generates a
> dictionary and inserts it into the dictionary catalog table. Additionally,
> we update the 'pg_attribute' attribute options to record the unique
> dictionary ID for that specific attribute. During compression, based on the
> available dictionary ID, we retrieve the dictionary and use it to compress
> the documents. I’ve created a standard training function
> (`zstd_dictionary_builder`) for JSONB, TEXT, and BYTEA.
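>
> For reference, the training step itself boils down to a single ZDICT call.
> The sketch below only illustrates the libzstd API being relied on; the
> helper name and error handling are mine, not necessarily what the patch
> does:
>
> ```
> #include <zdict.h>          /* ZDICT_trainFromBuffer(), part of libzstd */
>
> static size_t
> train_zstd_dictionary(const ZstdTrainingData *td,
>                       void *dict_buf, size_t dict_capacity)
> {
>     /* Feed the collected samples to the ZStandard training API. */
>     size_t dict_size = ZDICT_trainFromBuffer(dict_buf, dict_capacity,
>                                              td->sample_buffer,
>                                              td->sample_sizes,
>                                              td->nitems);
>
>     if (ZDICT_isError(dict_size))
>         elog(ERROR, "zstd dictionary training failed: %s",
>              ZDICT_getErrorName(dict_size));
>
>     /* The dictID baked into the trained dictionary is what later appears
>      * in the header of every frame compressed with it. */
>     elog(DEBUG1, "trained zstd dictionary, dictID = %u",
>          ZDICT_getDictID(dict_buf, dict_size));
>
>     return dict_size;
> }
> ```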
>
> We store the dictionary and its dictid in the new catalog table 'pg_zstd_dictionaries':
>
> ```
> test=# \d pg_zstd_dictionaries
>      Table "pg_catalog.pg_zstd_dictionaries"
>  Column | Type  | Collation | Nullable | Default
> --------+-------+-----------+----------+---------
>  dictid | oid   |           | not null |
>  dict   | bytea |           | not null |
> Indexes:
>     "pg_zstd_dictionaries_dictid_index" PRIMARY KEY, btree (dictid)
> ```
>
> This is the entire ZStandard dictionary infrastructure. A column can have
> multiple dictionaries; the latest one is identified via the pg_attribute
> attoptions. We never delete dictionaries once they are generated. If no
> dictionary is available and attcompression is set to zstd, we compress with
> ZStandard without a dictionary. For decompression, the zstd-compressed
> frame contains the identifier (dictid) of the dictionary used for
> compression; we read this dictid from the frame, fetch the corresponding
> dictionary, and decompress.
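>
> To make the decompression path concrete: the dictid lookup maps onto
> standard libzstd calls roughly as below. The fetch_dictionary_from_catalog()
> helper is hypothetical, standing in for the pg_zstd_dictionaries lookup;
> the zstd calls themselves are the real API.
>
> ```
> #include <zstd.h>
>
> /* Hypothetical helper: returns the dictionary bytes for a dictid. */
> extern const void *fetch_dictionary_from_catalog(unsigned dictid,
>                                                  size_t *dict_size);
>
> static size_t
> zstd_decompress_with_dict(const void *src, size_t src_size,
>                           void *dst, size_t dst_capacity)
> {
>     /* The frame header records which dictionary was used (0 = none). */
>     unsigned    dictid = ZSTD_getDictID_fromFrame(src, src_size);
>     const void *dict = NULL;
>     size_t      dict_size = 0;
>     ZSTD_DCtx  *dctx = ZSTD_createDCtx();
>     size_t      result;
>
>     if (dictid != 0)
>         dict = fetch_dictionary_from_catalog(dictid, &dict_size);
>
>     result = ZSTD_decompress_usingDict(dctx, dst, dst_capacity,
>                                        src, src_size, dict, dict_size);
>     ZSTD_freeDCtx(dctx);
>
>     if (ZSTD_isError(result))
>         elog(ERROR, "zstd decompression failed: %s",
>              ZSTD_getErrorName(result));
>     return result;
> }
> ```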
>
> #############################################################################
>
> Enter toast compression framework changes,
>
> We identify a compressed datum's compression algorithm using the top two
> bits of va_tcinfo (varattrib_4b.va_compressed), which allows for four
> compression methods. However, based on previous community discussion of
> TOAST compression changes [3], using those bits directly for a new
> compression algorithm was rejected, and it was suggested to extend the
> scheme instead, which is what I’ve implemented in this patch. This change
> necessitates an update to the 'varattrib_4b' and 'varatt_external' on-disk
> structures. I’ve made sure these changes are backward compatible.
>
> ```
> typedef union
> {
>     struct                      /* Normal varlena (4-byte length) */
>     {
>         uint32  va_header;
>         char    va_data[FLEXIBLE_ARRAY_MEMBER];
>     }           va_4byte;
>     struct                      /* Compressed-in-line format */
>     {
>         uint32  va_header;
>         uint32  va_tcinfo;      /* Original data size (excludes header) and
>                                  * compression method; see va_extinfo */
>         char    va_data[FLEXIBLE_ARRAY_MEMBER];   /* Compressed data */
>     }           va_compressed;
>     struct
>     {
>         uint32  va_header;
>         uint32  va_tcinfo;
>         uint32  va_cmp_alg;
>         char    va_data[FLEXIBLE_ARRAY_MEMBER];
>     }           va_compressed_ext;
> } varattrib_4b;
>
> typedef struct varatt_external
> {
>     int32   va_rawsize;         /* Original data size (includes header) */
>     uint32  va_extinfo;         /* External saved size (without header) and
>                                  * compression method */
>     Oid     va_valueid;         /* Unique ID of value within TOAST table */
>     Oid     va_toastrelid;      /* RelID of TOAST table containing it */
>     uint32  va_cmp_alg;         /* Additional compression algorithm
>                                  * information */
> } varatt_external;
> ```
>
> Since I needed to update these structs, I’ve made changes to the existing
> macros, and I’ve added ZStandard compression and decompression routines as
> needed. These are the major design changes in the patch to incorporate
> ZStandard with dictionary compression.
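>
> To make the escape mechanism concrete, the lookup could work roughly as in
> the sketch below. This is a simplified illustration rather than the exact
> macros in the patch; in particular, the reserved marker value (3) is just
> my way of depicting the "extend it" idea.
>
> ```
> /* VARLENA_EXTSIZE_BITS (30) is the existing varatt.h constant: the low
>  * 30 bits of va_tcinfo hold the size, the top two bits the method. */
> #define TOAST_EXTENDED_COMPRESSION_ID   3   /* illustrative escape value */
>
> static inline uint32
> toast_get_compression_algorithm(const varattrib_4b *ptr)
> {
>     uint32  method = ptr->va_compressed.va_tcinfo >> VARLENA_EXTSIZE_BITS;
>
>     if (method == TOAST_EXTENDED_COMPRESSION_ID)
>         return ptr->va_compressed_ext.va_cmp_alg;   /* e.g. zstd */
>
>     return method;      /* existing pglz / lz4 encodings */
> }
> ```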
>
> Please let me know what you think about all this. Are there any concerns
> with my approach? In particular, I would appreciate your thoughts on the
> on-disk changes that result from this.
>
> kind regards,
>
> Nikhil Veldanda
> Amazon Web Services: https://aws.amazon.com
>
> [1] https://facebook.github.io/zstd/
> [2] https://github.com/facebook/zstd
> [3] https://www.postgresql.org/message-id/flat/YoMiNmkztrslDbNS%40paquier.xyz

The overall idea is great.

I just want to mention that LZ4 also has an API to use a dictionary. Its
dictionary is simply "virtually prepended" text (in contrast to the complex
ZStd dictionary format).

I mean, it would be great if "dictionary" were a common property shared by
different algorithms.

On the other hand, zstd has a "super fast" mode which is actually a bit
faster than LZ4 and compresses a bit better. So maybe support for
different algorithms is not essential. (But then we need a way to set the
compression level to that "super fast" mode.)
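
For clarity, by "super fast" mode I mean zstd's negative compression levels.
A minimal illustration (level -5 is just an example value):

```
#include <zstd.h>

/* zstd's "super fast" mode: negative compression levels trade ratio
 * for speed; ZSTD_compress() accepts them directly. */
size_t
compress_super_fast(void *dst, size_t dst_cap,
                    const void *src, size_t src_size)
{
    return ZSTD_compress(dst, dst_cap, src, src_size, -5);
}
```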

-------
regards
Yura Sokolov aka funny-falcon
