Re: ZStandard (with dictionaries) compression support for TOAST compression

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Nikhil Kumar Veldanda <veldanda(dot)nikhilkumar17(at)gmail(dot)com>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: ZStandard (with dictionaries) compression support for TOAST compression
Date: 2025-03-17 20:02:54
Message-ID: CA+TgmoYziA7491-GLhU-jqZcCty9=eJSFzf3vQPiOxXUC9azsg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 7, 2025 at 8:36 PM Nikhil Kumar Veldanda
<veldanda(dot)nikhilkumar17(at)gmail(dot)com> wrote:
> struct /* Extended compression format */
> {
> uint32 va_header;
> uint32 va_tcinfo;
> uint32 va_cmp_alg;
> uint32 va_cmp_dictid;
> char va_data[FLEXIBLE_ARRAY_MEMBER];
> } va_compressed_ext;
> } varattrib_4b;

First, thanks for sending along the performance results. I agree that
those are promising. Second, thanks for sending these design details.

The idea of keeping dictionaries in pg_zstd_dictionaries literally
forever doesn't seem very appealing, but I'm not sure what the other
options are. I think we've established in previous work in this area
that compressed values can creep into unrelated tables and inside
records or other container types like ranges. Therefore, we have no
good way of knowing when a dictionary is unreferenced and can be
dropped. So in that sense your decision to keep them forever is
"right," but it's still unpleasant. It would even be necessary to make
pg_upgrade carry them over to new versions.

If we could make sure that compressed datums never leaked out into
other tables, then tables could depend on dictionaries and
dictionaries could be dropped when there were no longer any tables
depending on them. But like I say, previous work suggested that this
would be very difficult to achieve. However, without that, I imagine
users generating new dictionaries regularly as the data changes and
eventually getting frustrated that they can't get rid of the old ones.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2025-03-17 20:04:45 Re: optimize file transfer in pg_upgrade
Previous Message Andres Freund 2025-03-17 20:00:53 Re: AIO v2.5