Use read streams in pg_visibility

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Use read streams in pg_visibility
Date: 2024-08-13 12:22:27
Message-ID: CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I am working on using the read stream in pg_visibility. There are two
places to use it:

1- collect_visibility_data()

This one is straightforward. I created a read stream object if
'include_pd' is true because read I/O is done when 'include_pd' is
true. There is ~4% timing improvement with this change. I started the
server with the default settings and created a 6 GB table. Then run
100 times pg_visibility() by clearing the OS cache between each run.
----------

2- collect_corrupt_items()

This one is more complicated. The read stream callback function loops
until it finds a suitable block to read. So, if the callback returns
an InvalidBlockNumber; it means that the stream processed all possible
blocks and the stream should be finished. There is ~3% timing
improvement with this change. I started the server with the default
settings and created a 6 GB table. Then run 100 times
pg_check_visible() by clearing the OS cache between each run.

The downside of this approach is there are too many "vmbuffer is valid
but BufferGetBlockNumber(*vmbuffer) is not equal to mapBlock, so
vmbuffer needs to be read again" cases in the read stream version (700
vs 20 for the 6 GB table). This is caused by the callback function of
the read stream reading a new vmbuf while getting next block numbers.
So, vmbuf is wrong when we are checking visibility map bits that might
have changed while we were acquiring the page lock.
----------

Both patches are attached.

Any feedback would be appreciated.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment Content-Type Size
v1-0001-Use-read-stream-in-pg_visibility-in-collect_visib.patch text/x-patch 3.4 KB
v1-0002-Use-read-stream-in-pg_visibility-in-collect_corru.patch text/x-patch 5.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilia Evdokimov 2024-08-13 13:18:48 Re: Vacuum statistics
Previous Message vignesh C 2024-08-13 12:03:42 Re: Logical Replication of sequences