Re: Shared detoast Datum proposal

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Nikita Malakhov <hukutoc(at)gmail(dot)com>, Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: Michael Zhilin <m(dot)zhilin(at)postgrespro(dot)ru>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Shared detoast Datum proposal
Date: 2024-03-06 20:58:48
Message-ID: 233cce7b-d9ae-4605-9b76-1f3b98cbc7f0@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/5/24 10:03, Nikita Malakhov wrote:
> Hi,
>
> Tomas, sorry for confusion, in my previous message I meant exactly
> the same approach you've posted above, and came up with almost
> the same implementation.
>
> Thank you very much for your attention to this thread!
>
> I've asked Andy about this approach because of the same reasons you
> mentioned - it keeps cache code small, localized and easy to maintain.
>
> The question that worries me is the memory limit, and I think that cache
> lookup and entry invalidation should be done in toast_tuple_externalize
> code, for the case the value has been detoasted previously is updated
> in the same query, like
>

I haven't thought very much about when we need to invalidate entries and
where exactly should that happen. My patch is a very rough PoC that I
wrote in ~1h, and it's quite possible it will have to move or should be
refactored in some way.

But I don't quite understand the problem with this query:

> UPDATE test SET value = value || '...';

Surely the update creates a new TOAST value, with a completely new TOAST
pointer, new rows in the TOAST table etc. It's not updated in-place. So
it'd end up with two independent entries in the TOAST cache.

Or are you interested just in evicting the old value simply to free the
memory, because we're not going to need that (now invisible) value? That
seems interesting, but if it's too invasive or complex I'd just leave it
up to the LRU eviction (at least for v1).

I'm not sure why should anything happen in toast_tuple_externalize, that
seems like a very low-level piece of code. But I realized there's also
toast_delete_external, and maybe that's a good place to invalidate the
deleted TOAST values (which would also cover the UPDATE).

> I've added cache entry invalidation and data removal on delete and update
> of the toasted values and am currently experimenting with large values.
>

I haven't done anything about large values in the PoC patch, but I think
it might be a good idea to either ignore values that are too large (so
that it does not push everything else from the cache) or store them in
compressed form (assuming it's small enough, to at least save the I/O).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-03-06 21:34:54 Re: Make attstattarget nullable
Previous Message Tomas Vondra 2024-03-06 20:42:38 Re: Shared detoast Datum proposal