Re: Making aggregate deserialization (and WAL receive) functions slightly faster

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Date: 2023-10-09 17:38:01
Message-ID: 169185.1696873081@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> On Mon, 9 Oct 2023 at 21:17, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> Here are some more thoughts on how we could improve this:
>>
>> 1. Adjust the definition of StringInfoData.maxlen to define that -1
>> means the StringInfoData's buffer is externally managed.
>> 2. Adjust enlargeStringInfo() to add a check for maxlen = -1 and have
>> it palloc, say, pg_next_pow2(str->len * 2) bytes and memcpy the
>> existing (externally managed string) into the newly palloc'd buffer.
>> 3. Add a new function along the lines of what I originally proposed to
>> allow init of a StringInfoData using an existing allocated string
>> which sets maxlen = -1.
>> 4. Update all the existing places, including the ones I just committed
>> (plus the ones you committed in ba1e066e4) to make use of the function
>> added in #3.

Hm. I'd be inclined to use maxlen == 0 as the indicator of a read-only
buffer, just because that would not create a problem if we ever want
to change it to an unsigned type. Other than that, I agree with the
idea of using a special maxlen value to indicate that the buffer is
read-only and not owned by the StringInfo. We need to nail down the
exact semantics though.

> While working on this, I added an Assert in the new
> initStringInfoFromStringWithLen function to ensure that data[len] ==
> '\0' per the "There is guaranteed to be a terminating '\0' at
> data[len]" comment in stringinfo.h. It looks like we have some
> existing breakers of this rule.

Ugh. The point that 608fd198d also broke the terminating-nul
convention was something that occurred to me after sending
my previous message. That's something we can't readily accommodate
within the concept of a read-only buffer, but I think we can't
give it up without risking a lot of obscure bugs.

> I'll also need to revert 608fd198 as this also highlights that setting
> the StringInfoData.data to point to a bytea Datum can't be done either
> as those aren't NUL terminated strings.

Yeah. I would revert that as a separate commit and then think about
how we want to proceed, but I generally agree that there could be
value in the idea of a setup function that accepts a caller-supplied
buffer.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema 2023-10-09 18:11:07 Re: UUID v7
Previous Message Nick Babadzhanian 2023-10-09 16:46:17 Re: UUID v7