Re: Lowering the default wal_blocksize to 4K

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Lowering the default wal_blocksize to 4K
Date: 2023-10-11 01:39:12
Message-ID: CA+hUKG+b2ego8=YNW2Ohe9QmSiReh1-ogrv8V_WZpJTqP3O+2w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 11, 2023 at 12:29 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-10-10 21:30:44 +0200, Matthias van de Meent wrote:
> > On Tue, 10 Oct 2023 at 06:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > I was thinking we should perhaps do the opposite, namely getting rid of short
> > > page headers. The overhead in the "byte position" <-> LSN conversion due to
> > > the differing space is worse than the gain. Or do something in between - having
> > > the system ID in the header adds a useful crosscheck, but I'm far less
> > > convinced that having segment and block size in there, as 32bit numbers no
> > > less, is worthwhile. After all, if the system id matches, it's not likely that
> > > the xlog block or segment size differ.
> >
> > Hmm. I don't think we should remove those checks, as I can see people
> > that would want to change their XLog block size with e.g.
> > pg_resetwal.
>
> I don't think that's something we need to address in every physical
> segment. For one, there's no option to do so. But more importantly, if they
> don't change the xlog block size, we'll just accept random WAL as well. If
> somebody goes to the trouble of writing a custom tool, they can live with the
> consequences of that potentially causing breakage. Particularly if the checks
> wouldn't meaningfully prevent that anyway.

How about this idea: Put the system ID etc. into the new record Robert
is proposing for the redo point, and also into the checkpoint record,
so that it's at both ends of the to-be-replayed range. That just
leaves the WAL segments in between. If you find yourself writing a
new record that would go in the first usable byte of a segment, insert
a new special system ID (etc.) record that will be checked during
replay. For segments that start with XLP_FIRST_IS_CONTRECORD, don't
worry about it: those already form part of a chain of verification
(xlp_rem_len, xl_crc) that started on the preceding page, so it seems
almost impossible to accidentally replay from a segment that came from
another system.
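
To make the two cases concrete, here is a rough sketch of the decision on the writer side and the matching check on the replay side. None of this is existing PostgreSQL code: the hyp_ names, the HypSegCheck enum and the 24-byte uniform page header size are assumptions made up for illustration, and only the XLP_FIRST_IS_CONTRECORD flag name comes from the real tree (redefined locally so the sketch compiles by itself).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag name from PostgreSQL; redefined here so the sketch stands alone. */
#define XLP_FIRST_IS_CONTRECORD 0x0001

/* Assumption: every page, including a segment's first, has the same header size. */
#define HYP_PAGE_HEADER_SIZE 24

/* Hypothetical outcomes of checking the start of a segment during replay. */
typedef enum
{
    SEG_OK_CHAINED,     /* continuation record: covered by xlp_rem_len / xl_crc */
    SEG_OK_SYSID_MATCH, /* system ID record present and matches */
    SEG_BAD_SYSID,      /* system ID record present, but from another system */
    SEG_MISSING_SYSID   /* fresh record at segment start, no system ID record */
} HypSegCheck;

/*
 * Writer side: would a new record at insert_pos land on the first usable
 * byte of a segment?  If so, a system ID record should be inserted first.
 */
static bool
hyp_needs_sysid_record(uint64_t insert_pos, uint64_t seg_size)
{
    assert(seg_size > HYP_PAGE_HEADER_SIZE);
    return insert_pos % seg_size == HYP_PAGE_HEADER_SIZE;
}

/*
 * Replay side: decide how the start of a segment is verified, given the
 * flags of its first page and what the first record turned out to be.
 */
static HypSegCheck
hyp_check_segment_start(uint16_t first_page_info,
                        bool first_rec_is_sysid,
                        uint64_t rec_sysid,
                        uint64_t expected_sysid)
{
    /* Continuation from the previous segment: rely on the existing chain. */
    if (first_page_info & XLP_FIRST_IS_CONTRECORD)
        return SEG_OK_CHAINED;

    /* Otherwise the first record must be the special system ID record. */
    if (!first_rec_is_sysid)
        return SEG_MISSING_SYSID;

    return rec_sysid == expected_sysid ? SEG_OK_SYSID_MATCH : SEG_BAD_SYSID;
}
```

With a 16MB segment size, hyp_needs_sysid_record() is true only for positions sitting exactly 24 bytes into a segment, so the extra record would be written at most once per segment, and never for segments that begin in the middle of a record.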
