Using read stream in autoprewarm

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Using read stream in autoprewarm
Date: 2024-08-08 07:32:16
Message-ID: CAN55FZ3n8Gd+hajbL=5UkGzu_aHGRqnn+xktXq2fuds=1AOR6Q@mail.gmail.com
Lists: pgsql-hackers

Hi,

I am working on using the read stream API in autoprewarm. I observed a
~10% performance gain with this change. The patch is attached.

The downside of the read stream approach is that a new read stream
object needs to be created for each database, relation and fork. I was
wondering whether this would cause a regression, but it did not (at
least according to my testing). Another downside could be that the
code gets more complicated.

For the testing:
- I created 50 databases, each with 50 tables of 520KB.
    - patched: 51157 ms
    - master: 56769 ms
- I created 5 databases, each with 1 table of 3GB.
    - patched: 32679 ms
    - master: 36706 ms

I put a debugging message with timing information in the
autoprewarm_database_main() function, then ran autoprewarm 100 times
(by restarting the server), clearing the OS cache before each
restart. I also verified that the block number of the buffer returned
by the read stream API is correct. I am not sure whether this much
testing is enough for this kind of change.

Any feedback would be appreciated.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment: v1-0001-Use-read-stream-in-autoprewarm.patch (text/x-patch, 5.4 KB)
