Re: detoast datum into the given buffer as a optimization.

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jubilee Young <workingjubilee(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowley(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: detoast datum into the given buffer as a optimization.
Date: 2024-09-19 00:03:38
Message-ID: 87ikusy0f9.fsf@163.com
Lists: pgsql-hackers


Thank you all for the double check.

> Andy Fan <zhihuifan1213(at)163(dot)com> writes:
>> * Note if caller provides a non-NULL buffer, it is the duty of caller
>> * to make sure it has enough room for the detoasted format (Usually
>> * they can use toast_raw_datum_size to get the size)
>
> ..., It puts it on the caller to know how to get the detoasted length

Yes.

> and it implies double decoding of the toast datum.

Yes, we need to decode the toast datum to know the rawsize, as we
do in toast_raw_datum_size; this is an extra effort.

But I want to highlight that this "decoding" is different from
"detoast". The latter needs to scan the toast relation or decompress
the data, so it is heavy work, but the former just decodes some
existing memory at hand, which should be very cheap.
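
To make the "decoding" versus "detoast" distinction concrete, here is a
minimal standalone sketch. ToyExternalPointer and toy_raw_datum_size are
hypothetical names; the field names are modeled on PostgreSQL's
varatt_external as I understand it, but this is a toy model and not the
real toast_raw_datum_size code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy stand-in for an external TOAST pointer (hypothetical type; field
 * names borrowed from varatt_external for readability only).
 */
typedef struct ToyExternalPointer
{
    int32_t  va_rawsize;    /* size of the original datum, header included */
    uint32_t va_extinfo;    /* external saved size and compression method */
    uint32_t va_valueid;    /* id of the value within the toast table */
    uint32_t va_toastrelid; /* id of the toast table holding the value */
} ToyExternalPointer;

/*
 * "Decoding": copy the small pointer struct out of bytes we already hold
 * and read one field. There is no toast table scan and no decompression.
 */
static size_t
toy_raw_datum_size(const unsigned char *pointer_bytes)
{
    ToyExternalPointer ptr;

    /* memcpy because the pointer bytes may not be suitably aligned */
    memcpy(&ptr, pointer_bytes, sizeof(ptr));
    assert(ptr.va_rawsize >= 0);
    return (size_t) ptr.va_rawsize;
}
```

The cost here is one small memcpy and a field read, which is the sense
in which the extra decoding step should be cheap compared with fetching
or decompressing the value.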

>> One of the key point is we can always get the varlena rawsize cheaply
>> without any real detoast activity in advance, thanks to the existing
>> varlena design.
>
> This is not an assumption I care to wire into the API design.

OK. (I was just excited to find out we can get the rawsize so cheaply,
so that we could find an API to satisfy both use cases.)

> How about a variant like
>
> struct varlena *
> detoast_attr_cxt(struct varlena *attr, MemoryContext cxt)
>
> which promises to allocate the result in the specified context?
> That would cover most of the practical use-cases, I think.

I think this works for my use case 1 but doesn't work for my use case 2,
which requires that the detoasted data be written into a given memory
buffer (not only a certain MemoryContext). IIUC the use cases Jubilee
provided are more like use case 2.

""" (use case 2)
2. make printtup function a bit faster [2]. The patch there already
removed some palloc and memcpy effort, but there is still room to
optimize further. For example, the text_out function still detoasts
the datum into palloc'd memory and then copies it into a StringInfo.
"""
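
The difference between the two shapes can be sketched with a toy model.
Everything below is hypothetical (ToyToasted, ToyBuf and the toy_*
functions are illustration-only names, with malloc standing in for
palloc and a plain memcpy standing in for the real detoast work); it is
not PostgreSQL code, only a sketch of why a caller-supplied buffer saves
one allocation and one copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a toasted value whose raw size is known up front. */
typedef struct ToyToasted
{
    size_t      rawsize;    /* size of the detoasted payload */
    const char *stored;     /* pretend this needs fetching/decompressing */
} ToyToasted;

/* Toy stand-in for a StringInfo-like growable output buffer. */
typedef struct ToyBuf
{
    char   *data;
    size_t  len;
    size_t  cap;
} ToyBuf;

/* Make sure at least "extra" more bytes fit after buf->len. */
static void
toybuf_reserve(ToyBuf *buf, size_t extra)
{
    if (buf->data == NULL || buf->len + extra > buf->cap)
    {
        size_t  newcap = (buf->len + extra) * 2 + 16;
        char   *newdata = realloc(buf->data, newcap);

        assert(newdata != NULL);
        buf->data = newdata;
        buf->cap = newcap;
    }
}

/* Shape A: detoast into freshly allocated memory. */
static char *
toy_detoast_alloc(const ToyToasted *t)
{
    char *result = malloc(t->rawsize ? t->rawsize : 1);

    assert(result != NULL);
    memcpy(result, t->stored, t->rawsize);      /* the "detoast" work */
    return result;
}

/* Caller of shape A: one allocation plus a second copy into the output. */
static void
toy_append_via_alloc(ToyBuf *out, const ToyToasted *t)
{
    char *tmp = toy_detoast_alloc(t);

    toybuf_reserve(out, t->rawsize);
    memcpy(out->data + out->len, tmp, t->rawsize);  /* extra copy */
    out->len += t->rawsize;
    free(tmp);
}

/* Shape B: the caller guarantees dest has room for t->rawsize bytes. */
static void
toy_detoast_into(const ToyToasted *t, char *dest)
{
    memcpy(dest, t->stored, t->rawsize);        /* the "detoast" work */
}

/* Caller of shape B: reserve by raw size, then detoast in place. */
static void
toy_append_direct(ToyBuf *out, const ToyToasted *t)
{
    toybuf_reserve(out, t->rawsize);
    toy_detoast_into(t, out->data + out->len);
    out->len += t->rawsize;
}
```

Shape B only works because the caller can learn rawsize before any
detoasting happens, which is the dependency discussed earlier in this
message.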

I really want to make some progress in this direction, so thank you for
the feedback.

--
Best Regards
Andy Fan
