Re: Shared detoast Datum proposal

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Nikita Malakhov <hukutoc(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Michael Zhilin <m(dot)zhilin(at)postgrespro(dot)ru>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Shared detoast Datum proposal
Date: 2024-02-26 13:22:39
Message-ID: 87frxf1gld.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hi,

> When we store and detoast large values, say, 1Gb - that's a very likely scenario,
> we have such cases from prod systems - we would end up in using a lot of shared
> memory to keep these values alive, only to discard them later.

First the feature can make sure if the original expression will not
detoast it, this feature will not detoast it as well. Based on this, if
there is a 1Gb datum, in the past, it took 1GB memory as well, but
it may be discarded quicker than the patched version. but patched
version will *not* keep it for too long time since once the
slot->tts_values[*] is invalidated, the memory will be released
immedately.

For example:

create table t(a text, b int);
insert into t select '1-gb-length-text' from generate_series(1, 100);

select * from t where a like '%gb%';

The patched version will take 1gb extra memory only.

Are you worried about this case or some other cases?

> Also, toasted values
> are not always being used immediately and as a whole, i.e. jsonb values are fully
> detoasted (we're working on this right now) to extract the smallest value from
> big json, and these values are not worth keeping in memory. For text values too,
> we often do not need the whole value to be detoasted.

I'm not sure how often this is true, espeically you metioned text data
type. I can't image why people acquire a piece of text and how can we
design a toasting system to fulfill such case without detoast the whole
as the first step. But for the jsonb or array data type, I think it is
more often. However if we design toasting like that, I'm not sure if it
would slow down other user case. for example detoasting the whole piece
use case. I am justing feeling both way has its user case, kind of heap
store and columna store.

FWIW, as I shown in the test case, this feature is not only helpful for
big datum, it is also helpful for small toast datum.

--
Best Regards
Andy Fan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-02-26 13:24:37 Re: Synchronizing slots from primary to standby
Previous Message Bertrand Drouvot 2024-02-26 13:13:56 Re: Synchronizing slots from primary to standby