Re: detoast datum into the given buffer as a optimization.

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Jubilee Young <workingjubilee(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowley(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: detoast datum into the given buffer as a optimization.
Date: 2024-09-19 00:34:38
Message-ID: 87cyl0xyzl.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jubilee Young <workingjubilee(at)gmail(dot)com> writes:

> On Wed, Sep 18, 2024 at 2:23 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>>
>> On Wed, Sep 18, 2024 at 05:35:56PM +0800, Andy Fan wrote:
>> > Currently detoast_attr always detoast the data into a palloc-ed memory
>> > and then if user wants the detoast data in a different memory, user has to
>> > copy them, I'm thinking if we could provide a buf as optional argument for
>> > detoast_attr to save such wastage.
>> >
>> > [...]
>> >
>> > What do you think?
>>
>> My first thought is that this seems reasonable if there are existing places
>> where we are copying the data out of the palloc'd memory, but otherwise it
>> might be more of a prerequisite patch for the other things you mentioned.
>>
>
> This would also simplify data copying patterns many extensions have to do.
> For instance, often they have to move the data from Postgres memory into
> another language's allocation types. Or just for custom data structures,
> of course.

I thought we have, but did't check it so far. If we figured out an API,
we can use them for optimizing some existing code.

> I would suggest that this be added as new API, however, instead of a change
> to `detoast_attr`.

I agree that new API would usually be clearer than adding a more
argument, just that I don't want copy-paste too much existing
code. Actually the thing in my mind is:

struct varlena *
detoast_attr(struct varlena *attr)
{
int rawsize = toast_raw_datum_size(attr);
char *buffer = palloc(rawsize)
return detoast_attr_buffer(attr, buffer);
}

struct varlena *
detoast_attr_buffer(struct varlena *attr, char *buffer)
{
...
}

In this case:
- there is no existing code need to be changed.
- detoast_attr_buffer is tested sufficiently automatically.

> This would make different return types more sensical,
> as there is no need to implicitly allocate. It could return an error
> type?

I can't understand the error here.

>
>> * Note if caller provides a non-NULL buffer, it is the duty of caller
>> * to make sure it has enough room for the detoasted format (Usually
>> * they can use toast_raw_datum_size to get the size)
>
> I'm not entirely sure why the caller is being given the burden of checking,
> but I suppose they probably did check? I can imagine scenarios where they
> are not interested, however, and the callee always has to obtain the data
> for len written anyways. So I would probably make writable length a
> third arg.

I didn't follow up here as well:(, do you mind to explain a bit?

--
Best Regards
Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-09-19 00:49:31 Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN
Previous Message Peter Smith 2024-09-19 00:08:19 Re: Add contrib/pg_logicalsnapinspect