Re: Proposal: PqSendBuffer removal

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Aleksei Ivanov <iv(dot)alekseii(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: PqSendBuffer removal
Date: 2020-03-06 05:32:48
Message-ID: CAMsr+YGtBZhW7yzidg3JJbdfm9QkRXx4_gnitua0qNBBu+P8_g@mail.gmail.com
Lists: pgsql-hackers

On Fri, 6 Mar 2020 at 07:27, Aleksei Ivanov <iv(dot)alekseii(at)gmail(dot)com> wrote:
>
> > What do you mean "just one syscall"? The entire point here is that it'd take more syscalls to send the same amount of data.
>
> I mean that if messages are large enough (more than 2K), we will need 4 syscalls without copying them to the internal buffer, but currently we copy 8K of messages and send them using 1 call. I think that under some threshold of packet length it is redundant to copy the data to the internal buffer, and it can be sent directly.

I think what you're suggesting is more complex than you may expect.
PostgreSQL is single-threaded and relies pretty heavily on the ability
to buffer internally. It also expects its network I/O to always
succeed. Just switching to directly doing nonblocking I/O is not very
feasible. Changing the network I/O paths may expose a lot more
opportunities for send vs receive deadlocks.

It also complicates the protocol's handling of message boundaries,
since failures and interruptions can occur at more points.

Have you measured anything that suggests that our admittedly
inefficient multiple handling of send buffers is
performance-significant compared to the vast amount of memory
allocation and copying we do all over the place elsewhere? Do you have
a concrete reason to want to remove this?

If I had to change this model I'd probably be looking at an
iovector-style approach, like we use with shm_mq. Assemble an array of
buffer descriptors pointing to short, usually statically allocated
buffers and populate one with each pqformat step. Then when the
message is assembled, use writev(2) or similar to dispatch it. Maybe
do some automatic early flushing if the buffer space overflows. But
that might need a protocol extension so we had a way to recover after
interrupted sending of a partial message...
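
To make the iovec idea a bit more concrete, here is a rough standalone
sketch, not PostgreSQL code. MsgVec, msgvec_add() and msgvec_flush() are
names I made up for illustration, and it assumes a blocking socket, so it
says nothing about the nonblocking or partial-message recovery problems
mentioned above. It only shows collecting buffer descriptors and handing
them to writev(2), retrying on EINTR and short writes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MSG_MAX_IOV 16			/* hypothetical limit, chosen arbitrarily */

/* Hypothetical container for the pieces of one outgoing message. */
typedef struct MsgVec
{
	struct iovec iov[MSG_MAX_IOV];
	int			niov;
} MsgVec;

static void
msgvec_init(MsgVec *m)
{
	m->niov = 0;
}

/*
 * Record a pointer to one piece of the message; nothing is copied, so the
 * caller must keep buf valid until the flush. Returns 0, or -1 if full.
 */
static int
msgvec_add(MsgVec *m, const void *buf, size_t len)
{
	if (m->niov >= MSG_MAX_IOV)
		return -1;				/* a real version might flush early here */
	m->iov[m->niov].iov_base = (void *) buf;
	m->iov[m->niov].iov_len = len;
	m->niov++;
	return 0;
}

/*
 * Send all recorded pieces, retrying on EINTR and after short writes.
 * Returns the total bytes written, or -1 on error (the message may then
 * have been partially sent).
 */
static ssize_t
msgvec_flush(int fd, MsgVec *m)
{
	struct iovec *iov = m->iov;
	int			n = m->niov;
	ssize_t		total = 0;

	while (n > 0)
	{
		ssize_t		w = writev(fd, iov, n);

		if (w < 0)
		{
			if (errno == EINTR)
				continue;
			return -1;
		}
		total += w;

		/* Skip descriptors that were written completely. */
		while (n > 0 && (size_t) w >= iov->iov_len)
		{
			w -= iov->iov_len;
			iov++;
			n--;
		}
		/* Advance into a partially written descriptor, if any. */
		if (n > 0)
		{
			iov->iov_base = (char *) iov->iov_base + w;
			iov->iov_len -= w;
		}
	}
	m->niov = 0;
	return total;
}
```

Each pqformat-like step would call msgvec_add() once, and the message
would go out with a single msgvec_flush() when the vector is small enough
for one writev() call. The error return is exactly the awkward case: the
peer may have seen only part of a message.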

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
