Using read stream in autoprewarm

From:	Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Using read stream in autoprewarm
Date:	2024-08-08 07:32:16
Message-ID:	CAN55FZ3n8Gd+hajbL=5UkGzu_aHGRqnn+xktXq2fuds=1AOR6Q@mail.gmail.com
Lists:	pgsql-hackers

Hi,

I am working on using the read stream in autoprewarm. I observed ~10%
performance gain with this change. The patch is attached.

The downside of the read stream approach is that a new read stream
object needs to be created for each database, relation and fork. I was
wondering if this would cause a regression, but it did not (at least
according to my test results). Another downside could be that the
code gets more complicated.
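
To illustrate the "one stream per database, relation and fork" point, here is a toy model written in Python rather than the patch's C code. BlockRecord, ToyReadStream and prewarm are hypothetical names used only for illustration; none of them is the PostgreSQL read stream API, and the actual patch may be structured differently. The sketch assumes the block records are sorted so that all blocks of one fork are adjacent.

```python
from itertools import groupby
from typing import NamedTuple


class BlockRecord(NamedTuple):
    """Hypothetical stand-in for one entry of the autoprewarm block list."""
    database: int
    relation: int
    fork: int
    block: int


class ToyReadStream:
    """Toy stream: hands out the block numbers of a single fork, in order."""

    def __init__(self, blocks):
        self._blocks = iter(blocks)

    def next_block(self):
        # None signals end of stream; block 0 is a valid block number.
        return next(self._blocks, None)


def prewarm(records):
    """Create one toy stream per (database, relation, fork) group."""
    streams_created = 0
    blocks_read = []
    # Sorting makes all blocks of one fork adjacent, so groupby sees each
    # (database, relation, fork) key exactly once.
    for key, group in groupby(sorted(records), key=lambda r: r[:3]):
        stream = ToyReadStream([r.block for r in group])
        streams_created += 1
        while (block := stream.next_block()) is not None:
            blocks_read.append((*key, block))
    return streams_created, blocks_read


if __name__ == "__main__":
    records = [
        BlockRecord(1, 10, 0, 2),
        BlockRecord(1, 10, 0, 0),
        BlockRecord(1, 10, 1, 0),
        BlockRecord(2, 10, 0, 5),
    ]
    count, blocks = prewarm(records)
    print(count, "streams created for", len(blocks), "blocks")
```

In this model the number of streams grows with the number of distinct forks, not with the number of blocks, which is why many small tables are the interesting case for a possible regression.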

For the testing:
- I created 50 databases, each with 50 tables of 520KB each.
  - patched: 51157 ms
  - master: 56769 ms
- I created 5 databases, each with 1 table of 3GB.
  - patched: 32679 ms
  - master: 36706 ms
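
As a quick check of the ~10% figure, the relative reduction in run time can be computed from the timings above. This is a minimal sketch; percent_gain is a hypothetical helper and not part of the patch.

```python
def percent_gain(master_ms: float, patched_ms: float) -> float:
    """Reduction in run time of patched relative to master, in percent."""
    return (master_ms - patched_ms) / master_ms * 100.0


# (master, patched) timings in milliseconds, as reported in this message.
results = {
    "50 databases x 50 tables x 520KB": (56769, 51157),
    "5 databases x 1 table x 3GB": (36706, 32679),
}

for name, (master_ms, patched_ms) in results.items():
    print(f"{name}: {percent_gain(master_ms, patched_ms):.1f}% less time")
```

This gives roughly 9.9% for the many-small-tables case and roughly 11.0% for the few-large-tables case.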

I put a debugging message with timing information in the
autoprewarm_database_main() function, then ran autoprewarm 100 times
(by restarting the server), clearing the OS cache before each
restart. Also, I ensured that the block number of the buffer returned
by the read stream API is correct. I am not sure if that much
testing is enough for this kind of change.

Any feedback would be appreciated.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment	Content-Type	Size
v1-0001-Use-read-stream-in-autoprewarm.patch	text/x-patch	5.4 KB

Responses

Re: Using read stream in autoprewarm at 2024-10-31 18:18:21 from Andrey M. Borodin
Re: Using read stream in autoprewarm at 2024-11-27 13:50:29 from Matheus Alcantara
