From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: XLog size reductions: Reduced XLog record header size for PG17 |
Date: | 2023-09-25 17:40:00 |
Message-ID: | CAEze2WgrJCVz2y-z_OYEBRbC=YH21L_UDHycMH-KsR8THh4bTA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, 20 Sept 2023 at 07:06, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote:
> > V5 is a rebased version of v4, and includes the latest patch from
> > "smaller XLRec block header" [0] as 0001.
>
> 0001 and 0007 are the meat of the changes.
Correct.
> -#define XLR_CHECK_CONSISTENCY 0x02
> +#define XLR_CHECK_CONSISTENCY (0x20)
>
> I can't help but notice that there are a few stylistic choices like
> this one that are part of the patch. Using parenthesis in the case of
> hexa values is inconsistent with the usual practices I've seen in the
> tree.
Yes, I'll take another look at that.
> #define COPY_HEADER_FIELD(_dst, _size) \
> do { \
> - if (remaining < _size) \
> + if (remaining < (_size)) \
> goto shortdata_err; \
>
> There are a couple of stylistic changes like this one, that I guess
> could just use their own patch to make these macros easier to use.
They actually fix complaints of my IDE, but are otherwise indeed stylistic.
> -#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
> +#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & XLR_INFO_MASK)
> +#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & XLR_RMGR_INFO_MASK)
>
> This stuff in 0002 is independent of 0001, am I right? Doing this
> split with an extra macro is okay by me, reducing the presence of
> XLR_INFO_MASK and bitwise operations based on it.
Yes, that change is to stop making use of (~XLR_INFO_MASK) where
XLR_RMGR_INFO_MASK is the correct bitmask (whilst also being quite
useful in the later patch).
> 0003 is also mechanical, but if you begin to enforce the use of
> XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR
> identity callback, we should have at least a validity check to make
> sure that nothing, even custom RMGRs, pass down unexpected bits?
I think that's already handled in XLogInsert(), but I'll make sure to
add more checks if they're not in place yet.
> I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and
> I fear that people are going to forget to set it. Wouldn't it be
> better to use an option where the XID is excluded instead, making the
> inclusing the an XID the default?
Most rmgrs don't actually use the XID. Only XACT, MULTIXACT, HEAP,
HEAP2, and LOGICALMSG use the xid, so I thought it would be easier to
just find the places where those RMGR's records were being logged than
to update all other places.
I don't mind changing how we decide to log the XID, but I don't think
EXCLUDE_XID is a good alternative: most records just don't need the
transaction ID. There are many more index AMs with logging than table
AMs, so I don't think it is that weird to default to 'not included'.
> > The resource manager has ID = 0, thus requiring some special
> > handling in other code. Apart from being generally useful, it is
> > used in future patches to detect the end of wal in lieu of a zero-ed
> > fixed-size xl_tot_len field.
>
> Err, no, that may not be true. See for example this thread where the
> topic of improving the checks of xl_tot_len and rely on this value on
> when a record header has been validated, even across page borders:
> https://www.postgresql.org/message-id/17928-aa92416a70ff44a2@postgresql.org
Yes, there are indeed exceptions when reusing WAL segments, but it's
still a good canary, like xl_tot_len before this patch.
> Except that, in which cases could an invalid RMGR be useful?
A sentinel value that is obviously invalid is available for several
types, e.g. BlockNumber, TransactionId, XLogRecPtr, Buffer, and this
is quite useful if you want to check if something is definitely
invalid. I think that's fine in principle, we're already "wasting"
some IDs in the gap between RM_MAX_BUILTIN_ID and RM_MIN_CUSTOM_ID.
In the current xlog infrastructure, we use xl_tot_len as that sentinel
to detect whether a new record may exist, but in this patch that can't
be used because the field may not exist and depends on other bytes. So
I used xl_rmgr_id as the field to base the 'may a next record exist'
checks on, which required the 0 rmgr ID to be invalid.
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2023-09-25 17:56:36 | Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION } |
Previous Message | Robert Haas | 2023-09-25 17:35:16 | Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION } |