Re: Why is pq_begintypsend so slow?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jack Christensen <jack(at)jncsoftware(dot)com>, David Fetter <david(at)fetter(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is pq_begintypsend so slow?
Date: 2020-06-09 19:23:45
Message-ID: 20200609192345.x2itfl3v6eqg6l4y@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-06-09 11:46:09 -0400, Robert Haas wrote:
> On Wed, Jun 3, 2020 at 2:10 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Why do we need multiple buffers? ISTM we don't want to just send
> > messages at endmsg() time, because that implies unnecessary syscall
> > overhead. Nor do we want to imply the overhead of the copy from the
> > message buffer to the network buffer.
>
> It would only matter if there are multiple messages being constructed
> at the same time, and that's probably not common, but maybe there's
> some way it can happen.

ISTM that it'd be pretty broken if it could happen. We cannot have two
different parts of the system send messages to the client
independently. The protocol is pretty stateful...

> > To me that seems to imply that the best approach would be to have
> > PqSendBuffer be something stringbuffer like, and have pq_beginmessage()
> > record the starting position of the current message somewhere
> > (->cursor?). When an error is thrown, we reset the position to be where
> > the in-progress message would have begun.
>
> Yeah, I thought about that, but then how you detect the case where two
> different people try to undertake message construction at the same
> time?

Set a boolean and assert out if one is already in progress? We'd need
some state to know where to reset the position to on error anyway.
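
Roughly along these lines, as an illustrative sketch only. All names here
are hypothetical and not existing pqcomm.c code; the length word and
flushing to the socket are left out, and a real version would grow the
buffer rather than assert on overflow:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_SIZE 8192	/* arbitrary size for the sketch */

/* File-local state: callers can only reach it through the helpers below. */
static char sketch_buf[SKETCH_BUF_SIZE];
static size_t sketch_len = 0;		/* bytes currently in the buffer */
static size_t sketch_msg_start = 0;	/* start of the in-progress message */
static bool sketch_msg_in_progress = false;

/* Start a message; a second concurrent message is a programming error. */
static void
sketch_beginmessage(char msgtype)
{
	assert(!sketch_msg_in_progress);
	assert(sketch_len < SKETCH_BUF_SIZE);
	sketch_msg_in_progress = true;
	sketch_msg_start = sketch_len;	/* remember where to rewind to */
	sketch_buf[sketch_len++] = msgtype;
}

/* Append payload bytes directly to the shared output buffer. */
static void
sketch_sendbytes(const void *data, size_t len)
{
	assert(sketch_msg_in_progress);
	assert(len <= SKETCH_BUF_SIZE - sketch_len);
	memcpy(sketch_buf + sketch_len, data, len);
	sketch_len += len;
}

/* Finish the message; it stays in the buffer until a later flush. */
static void
sketch_endmessage(void)
{
	assert(sketch_msg_in_progress);
	sketch_msg_in_progress = false;
}

/* Error path: drop the partial message, keep earlier complete ones. */
static void
sketch_abortmessage(void)
{
	if (sketch_msg_in_progress)
	{
		sketch_len = sketch_msg_start;
		sketch_msg_in_progress = false;
	}
}
```

The same saved start position serves both purposes: the flag catches
overlapping message construction, and the position tells the error path
how far to rewind.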

> Like, with the idea I was proposing, you could still decide to limit
> yourself to 1 buffer at the same time, and just elog() if someone
> tries to allocate a second buffer when you've already reached the
> maximum number of allocated buffers (i.e. one). But if you just have
> one buffer in a global variable and everybody writes into it, you
> might not notice if some unrelated code writes data into that buffer
> in the middle of someone else's message construction. Doing it the way
> I proposed, writing data requires passing a buffer pointer, so you can
> be sure that somebody had to get the buffer from somewhere... and any
> rules you want to enforce can be enforced at that point.

I'd hope that we'd encapsulate the buffer management into file local
variables in pqcomm.c or such, and that code outside of that cannot
access the out buffer directly without using the appropriate helpers.

> > I've before wondered / suggested that we should have StringInfos not
> > insist on having one consecutive buffer (which obviously implies needing
> > to copy contents when growing). Instead it should have a list of buffers
> > containing chunks of the data, and never copy contents around while the
> > string is being built. We'd only allocate a buffer big enough for all
> > data when the caller actually wants to have all the resulting data in
> > one string (rather than using an API that can iterate over chunks).
>
> It's a thought. I doubt it's worth it for small amounts of data, but
> for large amounts it might be. On the other hand, a better idea still
> might be to size the buffer correctly from the start...

I think those are complementary. I do agree that it's useful to size
stringinfos more appropriately immediately (there's an upthread patch
adding a version of initStringInfo() that does so, quite useful for
small stringinfos in particular). But there are enough cases where that's
not really knowable ahead of time that I think it'd be quite useful to
have support for the type of buffer I describe above.
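
To make the shape of such a buffer concrete, here is a minimal sketch.
Everything in it is an assumption for illustration: the names are
hypothetical, it uses malloc() rather than palloc() and memory contexts,
and the chunk size is deliberately tiny so that the chunking is visible.
Appending never moves existing data; a single contiguous copy is made
only when the caller asks for one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 16			/* deliberately tiny for the sketch */

typedef struct Chunk
{
	struct Chunk *next;
	size_t		used;			/* bytes filled in data[] */
	char		data[CHUNK_SIZE];
} Chunk;

typedef struct ChunkedString
{
	Chunk	   *head;
	Chunk	   *tail;
	size_t		total;			/* bytes across all chunks */
} ChunkedString;

static void
cs_init(ChunkedString *cs)
{
	cs->head = cs->tail = NULL;
	cs->total = 0;
}

/* Append len bytes, adding chunks as needed. Returns 0, or -1 on OOM. */
static int
cs_append(ChunkedString *cs, const char *src, size_t len)
{
	assert(cs != NULL);
	while (len > 0)
	{
		size_t		room;
		size_t		n;

		if (cs->tail == NULL || cs->tail->used == CHUNK_SIZE)
		{
			Chunk	   *c = malloc(sizeof(Chunk));

			if (c == NULL)
				return -1;
			c->next = NULL;
			c->used = 0;
			if (cs->tail != NULL)
				cs->tail->next = c;
			else
				cs->head = c;
			cs->tail = c;
		}
		room = CHUNK_SIZE - cs->tail->used;
		n = len < room ? len : room;
		memcpy(cs->tail->data + cs->tail->used, src, n);
		cs->tail->used += n;
		cs->total += n;
		src += n;
		len -= n;
	}
	return 0;
}

/* One allocation and one copy, only on demand. Caller frees the result. */
static char *
cs_flatten(const ChunkedString *cs)
{
	char	   *out = malloc(cs->total + 1);
	char	   *p = out;

	if (out == NULL)
		return NULL;
	for (const Chunk *c = cs->head; c != NULL; c = c->next)
	{
		memcpy(p, c->data, c->used);
		p += c->used;
	}
	*p = '\0';
	return out;
}

/* Release all chunks and reset to the empty state. */
static void
cs_free(ChunkedString *cs)
{
	Chunk	   *c = cs->head;

	while (c != NULL)
	{
		Chunk	   *next = c->next;

		free(c);
		c = next;
	}
	cs_init(cs);
}
```

A consumer that can work chunk by chunk (for instance handing each chunk
to a send routine) would walk the list directly and never call
cs_flatten() at all.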

Greetings,

Andres Freund
