From: | "Bagga, Rishu" <bagrishu(at)amazon(dot)com> |
---|---|
To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | SLRUs in the main buffer pool - Page Header definitions |
Date: | 2022-06-22 21:06:29 |
Message-ID: | EFAAC0BE-27E9-4186-B925-79B7C696D5AC@amazon.com |
Lists: | pgsql-hackers |
Hi all,
PostgreSQL currently maintains several data structures in the SLRU
cache. The SLRU cache has scaling and sizing challenges because of its
simple implementation. The goal is to move these caches to the common
buffer cache to benefit from the stronger capabilities of the common
buffercache code. At AWS, we are building on the patch shared by Thomas
Munro [1], which treats the SLRU pages as part of a pseudo-database
of ID 9. We will refer to the pages belonging to SLRU components as
BufferedObject pages going forward.
The current SLRU pages do not have any header, so there is a need to
create a new page header format for these. Our investigations revealed
that we need to:
1. Track the LSN to ensure durability and consistency of all pages (for
redo and full-page write purposes).
2. Have a checksum (for page correctness verification).
3. Have a flag to identify whether the page is a relational or
BufferedObject page.
4. Track version information.
We are suggesting a minimal BufferedObject page header
to be the following, overlapping with the key fields near the beginning
of the regular PageHeaderData:
typedef struct BufferedObjectPageHeaderData
{
    PageXLogRecPtr pd_lsn;
    uint16_t pd_checksum;
    uint16_t pd_flags;
    uint16_t pd_pagesize_version;
} BufferedObjectPageHeaderData;
For reference, the regular page header looks like the following:
typedef struct PageHeaderData
{
    PageXLogRecPtr pd_lsn;
    uint16_t pd_checksum;
    uint16_t pd_flags;
    LocationIndex pd_lower;
    LocationIndex pd_upper;
    LocationIndex pd_special;
    uint16_t pd_pagesize_version;
    TransactionId pd_prune_xid;
    ItemIdDataCommon pd_linp[];
} PageHeaderData;
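To make the overlap concrete, here is a minimal sketch (not code from the
patch) that compares field offsets in the two headers. The typedefs at the
top are stand-ins assumed for illustration; in particular, ItemIdDataCommon
is modeled as a 4-byte value, and shared_prefix_bytes() is a hypothetical
helper. Under those assumptions, pd_lsn, pd_checksum and pd_flags sit at
the same offsets in both layouts, while pd_pagesize_version does not
(offset 12 versus 18), so generic code would have to inspect pd_flags
before reading the version field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs, assumed here so the sketch compiles on its own. */
typedef struct { uint32_t xlogid; uint32_t xrecoff; } PageXLogRecPtr;
typedef uint16_t LocationIndex;
typedef uint32_t TransactionId;
typedef uint32_t ItemIdDataCommon;  /* assumption: a 4-byte line pointer */

typedef struct BufferedObjectPageHeaderData
{
    PageXLogRecPtr pd_lsn;
    uint16_t    pd_checksum;
    uint16_t    pd_flags;
    uint16_t    pd_pagesize_version;
} BufferedObjectPageHeaderData;

typedef struct PageHeaderData
{
    PageXLogRecPtr pd_lsn;
    uint16_t    pd_checksum;
    uint16_t    pd_flags;
    LocationIndex pd_lower;
    LocationIndex pd_upper;
    LocationIndex pd_special;
    uint16_t    pd_pagesize_version;
    TransactionId pd_prune_xid;
    ItemIdDataCommon pd_linp[];
} PageHeaderData;

/* Compile-time checks: these three fields share offsets in both headers. */
static_assert(offsetof(BufferedObjectPageHeaderData, pd_lsn) ==
              offsetof(PageHeaderData, pd_lsn), "pd_lsn offset differs");
static_assert(offsetof(BufferedObjectPageHeaderData, pd_checksum) ==
              offsetof(PageHeaderData, pd_checksum), "pd_checksum offset differs");
static_assert(offsetof(BufferedObjectPageHeaderData, pd_flags) ==
              offsetof(PageHeaderData, pd_flags), "pd_flags offset differs");

/* Hypothetical helper: bytes that can be read without knowing the page type. */
static size_t
shared_prefix_bytes(void)
{
    return offsetof(BufferedObjectPageHeaderData, pd_flags) + sizeof(uint16_t);
}
```

With these stand-ins, shared_prefix_bytes() covers pd_lsn, pd_checksum and
pd_flags, which is exactly the part of the header the buffer manager can
interpret before it knows which kind of page it holds.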
After careful review, we have trimmed from the suggested header the
heap- and index-specific fields that do not add any value to SLRU
components. We plan to use pd_lsn, pd_checksum, and pd_pagesize_version
in the same way that they are used in relational pages. These fields
are needed to ensure consistency, durability and page correctness.
We will use the 4th bit of pd_flags to identify a BufferedObject page:
if the bit is set, the page is a BufferedObject page. Today, bits
1 - 3 indicate whether there are any free line pointers, whether
the page is full, and whether all tuples on the page are visible to
everyone, respectively. The storage manager will use the new bit to
determine which callback functions to use for file I/O operations.
This approach gives the buffercache a universal method to quickly
determine what type of page it is dealing with at any time.
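As a rough sketch of how such a check could look (an illustration under
stated assumptions, not the patch itself): the first three flag values
below match the existing definitions in bufpage.h, while
PD_BUFFERED_OBJECT, PageKind, and the two write callbacks are hypothetical
names invented for this example. The flags are read at byte offset 10,
which assumes an 8-byte pd_lsn followed by a 2-byte pd_checksum.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PD_HAS_FREE_LINES   0x0001  /* existing bit 1 */
#define PD_PAGE_FULL        0x0002  /* existing bit 2 */
#define PD_ALL_VISIBLE      0x0004  /* existing bit 3 */
#define PD_BUFFERED_OBJECT  0x0008  /* hypothetical name for the proposed bit 4 */

#define PD_FLAGS_OFFSET     10      /* assumed: after pd_lsn (8) and pd_checksum (2) */

typedef enum { PAGE_KIND_RELATIONAL, PAGE_KIND_BUFFERED_OBJECT } PageKind;

/* Read pd_flags from a raw page image without assuming a header struct. */
static uint16_t
page_get_flags(const char *page)
{
    uint16_t    flags;

    assert(page != NULL);
    memcpy(&flags, page + PD_FLAGS_OFFSET, sizeof(flags));
    return flags;
}

/* Classify the page by testing only the proposed bit. */
static PageKind
page_kind(const char *page)
{
    return (page_get_flags(page) & PD_BUFFERED_OBJECT)
        ? PAGE_KIND_BUFFERED_OBJECT
        : PAGE_KIND_RELATIONAL;
}

/* Hypothetical file I/O callbacks; real ones would issue the writes. */
typedef int (*page_write_fn) (const char *page);

static int relation_write(const char *page) { (void) page; return 0; }
static int buffered_object_write(const char *page) { (void) page; return 1; }

/* Pick the callback the storage manager would use for this page. */
static page_write_fn
select_write_callback(const char *page)
{
    return page_kind(page) == PAGE_KIND_BUFFERED_OBJECT
        ? buffered_object_write
        : relation_write;
}
```

Because only one bit is tested, the other flag bits can keep their
current meaning on relational pages.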
Using the new BufferedObject page header will be space efficient but
introduces a significant change in the codebase, which must now track
two types of page header. During upgrade, all SLRU files that exist on
the system must be converted to the new format with a page header. This
will require rewriting all the SLRU pages with the page header as part
of pg_upgrade.
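To illustrate why the conversion is a rewrite rather than an in-place
edit, here is a hedged sketch of the arithmetic. It assumes an 8192-byte
block, a 16-byte header (the 14 bytes of fields rounded up for alignment),
and that the old payload can be treated as a flat byte stream that is
simply re-packed; a real conversion would also have to respect each SLRU's
entry size and page addressing. BO_HEADER_SIZE, pages_after_conversion()
and repack_pages() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ          8192
#define BO_HEADER_SIZE  16                          /* assumed header size */
#define BO_PAYLOAD_SIZE (BLCKSZ - BO_HEADER_SIZE)   /* usable bytes per new page */

/* Number of headered pages needed to hold old_pages headerless pages. */
static uint64_t
pages_after_conversion(uint64_t old_pages)
{
    uint64_t    payload_bytes = old_pages * BLCKSZ;

    return (payload_bytes + BO_PAYLOAD_SIZE - 1) / BO_PAYLOAD_SIZE;
}

/*
 * Re-pack old_pages headerless pages from src into headered pages in dst.
 * dst must have room for pages_after_conversion(old_pages) pages.
 * Returns the number of pages written.
 */
static uint64_t
repack_pages(const char *src, uint64_t old_pages, char *dst)
{
    uint64_t    remaining = old_pages * BLCKSZ;
    uint64_t    written = 0;

    assert(src != NULL && dst != NULL);
    while (remaining > 0)
    {
        uint64_t    chunk = remaining < BO_PAYLOAD_SIZE ? remaining : BO_PAYLOAD_SIZE;
        char       *page = dst + written * BLCKSZ;

        memset(page, 0, BLCKSZ);    /* zeroed header placeholder and tail padding */
        memcpy(page + BO_HEADER_SIZE, src, chunk);
        src += chunk;
        remaining -= chunk;
        written++;
    }
    return written;
}
```

Under these assumptions every old page spills a few bytes into the next
new page, so page boundaries shift throughout the file and each page has
to be rewritten.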
We believe that this is the correct approach for the long run. We would
love feedback if there are additional items of data that should be
tracked as well. Alternatively, we could re-use the existing page
header and treat the unused fields as padding. This feels like an
unclean approach but would avoid having two page header types in the
database.
Discussed with: Joe Conway, Nathan Bossart, Shawn Debnath
Rishu Bagga
Amazon Web Services (AWS)