Re: Flushing large data immediately in pqcomm

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Flushing large data immediately in pqcomm
Date: 2024-01-30 17:41:30
Message-ID: CAGPVpCTfzhiOCWPwpRpvV6EZU0egJix4jNObp_OkhfZESdPbFQ@mail.gmail.com
Lists: pgsql-hackers

Hi Heikki,

On Mon, 29 Jan 2024 at 19:12, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> > Proposed change modifies socket_putmessage to send any data larger than
> > 8K immediately without copying it into the send buffer. Assuming that
> > the send buffer would be flushed anyway due to reaching its limit, the
> > patch just gets rid of the copy part which seems unnecessary and sends
> > data without waiting.
>
> If there's already some data in PqSendBuffer, I wonder if it would be
> better to fill it up with data, flush it, and then send the rest of the
> data directly. Instead of flushing the partial data first. I'm afraid
> that you'll make a tiny call to secure_write(), followed by a large one,
> then a tiny one again, and so forth. Especially when socket_putmessage
> itself writes the msgtype and len, which are tiny, before the payload.
>

I agree that I could do better there, without flushing twice (once for
PqSendBuffer and once for the input data). PqSendBuffer always has some data
in it, even if it's tiny, since msgtype and len are added first.
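
To make the suggested "fill, flush, then send the rest directly" order
concrete, here is a minimal sketch. It is not the patch and does not use
PostgreSQL's real structures: SendState, sim_write, sim_flush and
sim_putbytes are hypothetical names, and the socket write is only simulated
by counting calls and bytes. Buffering a remainder smaller than the buffer,
instead of writing it directly, is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEND_BUF_SIZE 8192		/* same 8kB figure as PqSendBufferSize above */

/* Hypothetical stand-in for the send state; not PostgreSQL's real layout. */
typedef struct
{
	char		buf[SEND_BUF_SIZE];
	size_t		used;			/* bytes currently buffered */
	size_t		write_calls;	/* simulated secure_write() calls */
	size_t		bytes_written;	/* total bytes handed to the "socket" */
} SendState;

/* Simulated socket write: only counts calls and bytes. */
static void
sim_write(SendState *s, const char *data, size_t len)
{
	(void) data;
	s->write_calls++;
	s->bytes_written += len;
}

/* Write out whatever is buffered, if anything. */
static void
sim_flush(SendState *s)
{
	if (s->used > 0)
	{
		sim_write(s, s->buf, s->used);
		s->used = 0;
	}
}

/* Queue len bytes using the fill / flush / send-rest-directly order. */
static void
sim_putbytes(SendState *s, const char *data, size_t len)
{
	assert(s->used < SEND_BUF_SIZE);

	/* Top up a partially filled buffer so pending bytes leave in a full write. */
	if (s->used > 0)
	{
		size_t		room = SEND_BUF_SIZE - s->used;
		size_t		n = (len < room) ? len : room;

		memcpy(s->buf + s->used, data, n);
		s->used += n;
		data += n;
		len -= n;
		if (s->used == SEND_BUF_SIZE)
			sim_flush(s);
	}

	/* The buffer is empty here whenever len > 0. */
	if (len >= SEND_BUF_SIZE)
		sim_write(s, data, len);	/* large remainder: skip the copy */
	else if (len > 0)
	{
		memcpy(s->buf, data, len);	/* small remainder: buffer it (assumption) */
		s->used = len;
	}
}
```

With a 5-byte header already buffered, a 50000-byte payload goes out in two
writes in this sketch (one full 8192-byte buffer, then the remaining 41813
bytes) rather than a tiny write followed by a large one.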

> Perhaps we should invent a new pq_putmessage() function that would take
> an input buffer with 5 bytes of space reserved before the payload.
> pq_putmessage() could then fill in the msgtype and len bytes in the
> input buffer and send that directly. (Not wedded to that particular API,
> but something that would have the same effect)
>

I thought about doing this. I didn't because I think such a change would
require adjusting the input buffers everywhere pq_putmessage is called, and
I did not want to touch that many different places. There might not be that
many callers of pq_putmessage, though; I'm not sure.
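
For reference, the reserved-header idea could look roughly like the sketch
below. fill_reserved_header and MSG_HEADER_SIZE are hypothetical names, not
an existing PostgreSQL API. The sketch assumes the caller allocated 5
writable bytes immediately before the payload, and it relies on the wire
protocol rule that the 4-byte length is big-endian and counts itself but not
the type byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_HEADER_SIZE 5		/* 1 byte msgtype + 4 bytes length */

/*
 * Hypothetical helper: "payload" must have MSG_HEADER_SIZE writable bytes
 * directly in front of it.  Fills in msgtype and length there, and returns
 * a pointer to the start of the complete message, which can then be sent
 * with a single write and no extra copy.
 */
static char *
fill_reserved_header(char *payload, size_t payload_len, char msgtype,
					 size_t *total_len)
{
	unsigned char *msg;
	uint32_t	len;

	assert(payload != NULL && total_len != NULL);

	msg = (unsigned char *) payload - MSG_HEADER_SIZE;
	len = (uint32_t) (payload_len + 4);	/* length includes itself */

	msg[0] = (unsigned char) msgtype;
	msg[1] = (unsigned char) ((len >> 24) & 0xFF);	/* network byte order */
	msg[2] = (unsigned char) ((len >> 16) & 0xFF);
	msg[3] = (unsigned char) ((len >> 8) & 0xFF);
	msg[4] = (unsigned char) (len & 0xFF);

	*total_len = payload_len + MSG_HEADER_SIZE;
	return (char *) msg;
}
```

The cost is the one described above: every caller has to build its payload
at an offset of 5 bytes into its own buffer, which is why the number of
call sites matters.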

>
> > This change affects places where pq_putmessage is used such as
> > pg_basebackup, COPY TO, walsender etc.
> >
> > I did some experiments to see how the patch performs.
> > Firstly, I loaded ~5GB data into a table [1], then ran "COPY test TO
> > STDOUT". Here are perf results of both the patch and HEAD
> > ...
> > The patch brings a ~5% gain in socket_putmessage.
> >
> > [1]
> > CREATE TABLE test(id int, name text, time TIMESTAMP);
> > INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 100)
> > AS name, NOW() AS time FROM generate_series(1, 100000000) AS i;
>
> I'm surprised by these results, because each row in that table is < 600
> bytes. PqSendBufferSize is 8kB, so the optimization shouldn't kick in in
> that test. Am I missing something?
>

You're absolutely right. I made a silly mistake there. I also think that
the way I did the perf analysis would not make much sense even if each row
of the table were larger than 8kB.

Here are some quick timing results, taken after making sure that the test
actually triggers this patch's optimization. I still need to think about how
to profile this with perf. I hope to share proper results soon.

I just added a few more zeros [1] and ran [2] (hopefully I measured the
correct thing):

HEAD:
real 2m48,938s
user 0m9,226s
sys 1m35,342s

Patch:
real 2m40,690s
user 0m8,492s
sys 1m31,001s

[1]
INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 10000)
AS name, NOW() AS time FROM generate_series(1, 1000000) AS i;
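
As a quick sanity check on the sizes involved, the small sketch below
compares the length of the name column in both data sets against the 8kB
buffer size. repeated_len and exceeds_send_buffer are hypothetical helpers
for illustration only, and the check ignores the id and time columns, which
add only a few bytes per row.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEND_BUF_SIZE 8192		/* 8kB, as PqSendBufferSize in the thread */

/* Length in bytes of the string s repeated n times, like repeat(s, n). */
static size_t
repeated_len(const char *s, size_t n)
{
	assert(s != NULL);
	return strlen(s) * n;
}

/* Would a value of this length be larger than the send buffer? */
static int
exceeds_send_buffer(size_t len)
{
	return len > SEND_BUF_SIZE;
}
```

repeat('dummy', 100) is 500 bytes, well under the threshold, while
repeat('dummy', 10000) is 50000 bytes, so only the second data set can
exercise the direct-send path.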

[2]
rm /tmp/dummy && echo 3 | sudo tee /proc/sys/vm/drop_caches && \
  time psql -d postgres -c "COPY test TO STDOUT;" > /tmp/dummy

Thanks,
--
Melih Mutlu
Microsoft
