Re: [PATCH] Compression dictionaries for JSONB

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Nikita Malakhov <hukutoc(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [PATCH] Compression dictionaries for JSONB
Date: 2022-07-27 08:30:18
Message-ID: CAEze2WispjAT3y96USL1+D_ya1juKVxXt76DMy6rhKyDrvN06Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, 27 Jul 2022 at 09:36, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com> wrote:
>
> On Sun, 17 Jul 2022 at 19:15, Nikita Malakhov <hukutoc(at)gmail(dot)com> wrote:
>
> > For using in special Toaster for JSON datatype compression dictionaries seem to be very valuable addition, but now I
> > have to agree that this feature in current state is competing with Pluggable TOAST.
>
> But I don't understand this.
>
> Why does storing a compression dictionary in the catalog prevent that
> dictionary from being used within the toaster?

The point is not that compression dictionaries in the catalog are bad
- I think they make a lot of sense - but that the typecast-based usage
of those dictionaries in user tables (like the UI provided by zson)
effectively competes with the toaster: it tries to store the data in a
more compressed manner than the toaster currently can because it has
additional knowledge about the values being toasted.
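
As a rough illustration of why "additional knowledge about the values"
helps, the sketch below compares generic compression with compression
against a preset dictionary. It uses Python's zlib rather than anything
from the patch or from zson; the dictionary contents, the sample
document, and the function names are assumptions made up for this
example.

```python
import zlib

# Hypothetical dictionary of byte strings that recur across documents.
# This is an assumption for illustration; a real dictionary would be
# built from the actual column data.
SHARED_DICT = b'{"customer_id": , "order_status": "open", "shipping_address": "'


def compress_generic(doc):
    """Compress with no prior knowledge about the value."""
    return zlib.compress(doc)


def compress_with_dict(doc):
    """Compress using a preset dictionary known to writer and reader."""
    comp = zlib.compressobj(zdict=SHARED_DICT)
    return comp.compress(doc) + comp.flush()


def decompress_with_dict(blob):
    """Reverse compress_with_dict; requires the same dictionary."""
    decomp = zlib.decompressobj(zdict=SHARED_DICT)
    return decomp.decompress(blob) + decomp.flush()


# A short, made-up JSON document whose keys appear in the dictionary.
doc = b'{"customer_id": 42, "order_status": "open", "shipping_address": "Main St 1"}'

generic_size = len(compress_generic(doc))
dict_size = len(compress_with_dict(doc))
```

For short values like this one, the dictionary variant can replace the
recurring keys with back-references into the dictionary, which generic
compression cannot do because it sees each value in isolation.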

The main difference between casting and toasting, however, is that
casting is fairly expensive because it has a significantly higher
memory overhead: both the fully decompressed and the compressed values
are stored in memory at the same time at some point when you cast a
value, while only the decompressed value is stored in full in memory
when (de)toasting.
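
The memory argument above can be sketched with a toy model. This is not
PostgreSQL code and does not reflect how the toaster is implemented; it
only counts, under simplifying assumptions, how many bytes have to be
resident at once when a whole compressed value is converted in one step
versus when the compressed input is consumed in small chunks. The
dictionary, the chunk size, and all function names are hypothetical.

```python
import zlib

# Hypothetical shared dictionary (illustration only).
DICTIONARY = b'{"customer_id": , "order_status": "open"}'


def compress(doc):
    """Compress a value against the shared dictionary."""
    comp = zlib.compressobj(zdict=DICTIONARY)
    return comp.compress(doc) + comp.flush()


def cast_style_peak(compressed):
    """Whole compressed value in, whole plain value out.

    Both buffers exist at the same time, so the modelled peak is the
    sum of their sizes.
    """
    decomp = zlib.decompressobj(zdict=DICTIONARY)
    plain = decomp.decompress(compressed) + decomp.flush()
    return plain, len(compressed) + len(plain)


def chunked_peak(compressed, chunk_size=64):
    """Feed the compressed value to the decompressor chunk by chunk.

    Only one input chunk plus the growing output is counted as resident.
    In this toy the caller still holds `compressed`; the model assumes
    the chunks would instead be fetched from storage on demand.
    """
    decomp = zlib.decompressobj(zdict=DICTIONARY)
    out = bytearray()
    for start in range(0, len(compressed), chunk_size):
        out += decomp.decompress(compressed[start:start + chunk_size])
    out += decomp.flush()
    return bytes(out), min(chunk_size, len(compressed)) + len(out)


# Made-up sample data: 200 small JSON records with varying ids.
records = b"".join(
    b'{"customer_id": %d, "order_status": "open"}' % i for i in range(200)
)
blob = compress(records)
```

In this model the chunked path never needs more than one input chunk
next to the decompressed output, whereas the cast-style path needs the
full compressed value next to it.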

And, considering that there is an open proposal for extending the
toaster mechanism, I think that it is not particularly efficient to
work with the relatively expensive typecast-based infrastructure if
this dictionary compression can instead be added using the proposed
extensible toasting mechanism at relatively low overhead.

Kind regards,

Matthias van de Meent
