From: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Using read stream in autoprewarm |
Date: | 2024-08-08 07:32:16 |
Message-ID: | CAN55FZ3n8Gd+hajbL=5UkGzu_aHGRqnn+xktXq2fuds=1AOR6Q@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I am working on using the read stream in autoprewarm. I observed a ~10%
performance gain with this change. The patch is attached.
The downside of the read stream approach is that a new read stream
object needs to be created for each database, relation and fork. I was
wondering if this would cause a regression, but it did not (at least
according to my test results). Another downside could be that the
code gets more complicated.
For the testing:
- I created 50 databases, each with 50 tables, where each table is
520KB.
  - patched: 51157 ms
  - master: 56769 ms
- I created 5 databases, each with 1 table, where each table is
3GB.
  - patched: 32679 ms
  - master: 36706 ms
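For reference, the relative gain implied by these timings can be checked with a short calculation. This is only a minimal sketch and not part of the patch: the helper name percent_faster is made up for illustration, and it assumes the gain is measured as the reduction in elapsed time relative to master.

```python
# Timings in milliseconds, copied from the test results above.
results = {
    "50 databases x 50 tables x 520KB": {"master": 56769, "patched": 51157},
    "5 databases x 1 table x 3GB": {"master": 36706, "patched": 32679},
}


def percent_faster(master_ms, patched_ms):
    """Return the reduction in elapsed time relative to master, in percent."""
    return (master_ms - patched_ms) / master_ms * 100.0


for name, timing in results.items():
    gain = percent_faster(timing["master"], timing["patched"])
    print(f"{name}: {gain:.1f}% less time than master")
```

Under that assumption the two cases come out at about 9.9% and 11.0%, which matches the ~10% figure quoted above.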
I added a debug message with timing information to the
autoprewarm_database_main() function, then ran autoprewarm 100 times
(by restarting the server), clearing the OS cache before each
restart. I also verified that the block number of each buffer returned
by the read stream API is correct. I am not sure if this much
testing is enough for this kind of change.
Any feedback would be appreciated.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Use-read-stream-in-autoprewarm.patch | text/x-patch | 5.4 KB |