From: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Use read streams in CREATE DATABASE command when the strategy is wal_log |
Date: | 2024-04-16 11:12:19 |
Message-ID: | CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I am working on using read streams in the CREATE DATABASE command when the
strategy is wal_log. RelationCopyStorageUsingBuffer() function is used in
this context. This function reads source buffers then copies them to the
destination buffers. I used read streams only when reading source buffers
because the destination buffers are read by 'RBM_ZERO_AND_LOCK' option, so
it is not important.
I created a ~6 GB table [1] and created a new database with the wal_log
strategy using the database that table was created in as a template [2]. My
benchmarking results are:
a. Timings:
patched:
12955.027 ms
12917.475 ms
13177.846 ms
12971.308 ms
13059.985 ms
master:
13156.375 ms
13054.071 ms
13151.607 ms
13152.633 ms
13160.538 ms
There is no difference in timings, the patched version is a tiny bit better
but it is negligible. I actually expected the patched version to be better
because there was no prefetching before, but the read stream API detects
sequential access and disables prefetching.
b. strace:
patched:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
68.02 3.749359 2 1285782 pwrite64
18.54 1.021734 21 46730 preadv
9.49 0.522889 826 633 fdatasync
2.55 0.140339 59 2368 pwritev
1.14 0.062583 409 153 fsync
master:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
59.71 3.831542 2 1288365 pwrite64
29.84 1.914540 2 747936 pread64
7.90 0.506843 837 605 fdatasync
1.58 0.101575 54 1856 pwritev
0.75 0.048431 400 121 fsync
There are fewer (~1/16) read system calls in the patched version.
c. perf:
patched:
- 97.83% 1.13% postgres postgres [.]
RelationCopyStorageUsingBuffer
- 97.83% RelationCopyStorageUsingBuffer
- 44.28% ReadBufferWithoutRelcache
+ 42.20% GetVictimBuffer
0.81% ZeroBuffer
+ 31.86% log_newpage_buffer
- 19.51% read_stream_next_buffer
- 17.92% WaitReadBuffers
+ 17.61% mdreadv
- 1.47% read_stream_start_pending_read
+ 1.46% StartReadBuffers
master:
- 97.68% 0.57% postgres postgres [.]
RelationCopyStorageUsingBuffer
- RelationCopyStorageUsingBuffer
- 65.48% ReadBufferWithoutRelcache
+ 41.16% GetVictimBuffer
- 20.42% WaitReadBuffers
+ 19.90% mdreadv
+ 1.85% StartReadBuffer
0.75% ZeroBuffer
+ 30.82% log_newpage_buffer
Patched version spends less CPU time in read calls and more CPU time in
other calls such as write.
There are three patch files attached. First two are optimization and adding
a way to create a read stream object by using SMgrRelation, these are
already proposed in the streaming I/O thread [3]. The third one is the
actual patch file.
Any kind of feedback would be appreciated.
[1] CREATE TABLE t as select repeat('a', 100) || i || repeat('b', 500) as
filler from generate_series(1, 9000000) as i;
[2] CREATE DATABASE test_1 STRATEGY 'wal_log' TEMPLATE test;
[3]
https://www.postgresql.org/message-id/CAN55FZ1yGvCzCW_aufu83VimdEYHbG_zuOY3J9JL-nBptyJyKA%40mail.gmail.com
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Refactor-PinBufferForBlock-to-remove-if-checks-ab.patch | text/x-patch | 3.6 KB |
v1-0002-Add-a-way-to-create-read-stream-object-by-using-S.patch | text/x-patch | 5.6 KB |
v1-0003-Use-read-streams-in-CREATE-DATABASE-when-strategy.patch | text/x-patch | 4.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | shveta malik | 2024-04-16 11:36:28 | Re: Disallow changing slot's failover option in transaction block |
Previous Message | Amul Sul | 2024-04-16 11:04:54 | Re: Add bump memory context type and use it for tuplesorts |