Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Date: 2022-09-21 21:41:28
Message-ID: CAH2-Wz=WRbum7u+hi8GLeX_h_GsUYDh9vgp_0wfnT3MJ+g0atA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 21, 2022 at 2:11 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > Presumably a
> > generic WAL record compression mechanism could be reused for other large
> > records, too. That could be much easier than devising a deduplication
> > strategy for every record type.
>
> It's quite possible that that's a good idea, but that should probably
> work as an additive thing. That's something that I think of as a
> "clever technique", whereas I'm focussed on just not being naive in
> how we represent this one specific WAL record type.

BTW, if you wanted to pursue something like this in a form that works
with many different types of WAL record, ISTM that a "medium level"
(not low level) approach might be the best place to start. In
particular, the way that page offset numbers are represented in many
WAL records is quite space-inefficient. A domain-specific approach
built with some understanding of how page offset numbers tend to look
in practice seems promising.

The representation of page offset numbers in PRUNE and VACUUM heapam
WAL records (and in index WAL records) always just stores an array of
2-byte OffsetNumber elements. It probably wouldn't be all that
difficult to come up with a simple scheme for compressing an array of
OffsetNumbers in WAL records. Getting it down to 1 byte per offset
number in most cases certainly seems achievable (even greater
improvements seem doable).
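
To make the idea concrete, here is a minimal sketch of one possible
scheme: delta encoding followed by a base-128 varint. It is not taken
from any patch. The names offsets_encode() and offsets_decode() are
hypothetical, and the sketch assumes that the array is sorted in
strictly ascending order, which a real WAL record format would have to
guarantee. Any delta below 128 takes a single byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same width as PostgreSQL's OffsetNumber (an unsigned 16-bit integer). */
typedef uint16_t OffsetNumber;

/*
 * Hypothetical helper: encode a strictly ascending array of offset numbers
 * as varint deltas.  Each output byte carries 7 payload bits; the high bit
 * means "more bytes follow".  "out" must have room for 3 * n bytes, the
 * worst case for 16-bit deltas.  Returns the number of bytes written.
 */
static size_t
offsets_encode(const OffsetNumber *offsets, size_t n, uint8_t *out)
{
    size_t       len = 0;
    OffsetNumber prev = 0;

    for (size_t i = 0; i < n; i++)
    {
        uint16_t delta;

        /* Assumption: input is sorted and unique, so each delta is >= 1. */
        assert(offsets[i] > prev);
        delta = (uint16_t) (offsets[i] - prev);
        prev = offsets[i];

        /* Emit the low 7 bits first, flagging every non-final byte. */
        while (delta >= 0x80)
        {
            out[len++] = (uint8_t) ((delta & 0x7F) | 0x80);
            delta >>= 7;
        }
        out[len++] = (uint8_t) delta;
    }
    return len;
}

/*
 * Hypothetical inverse of offsets_encode(): decode "len" bytes into
 * "offsets", which must have room for every encoded element.  Returns the
 * number of offset numbers decoded.
 */
static size_t
offsets_decode(const uint8_t *in, size_t len, OffsetNumber *offsets)
{
    size_t       n = 0;
    size_t       pos = 0;
    OffsetNumber prev = 0;

    while (pos < len)
    {
        uint16_t delta = 0;
        int      shift = 0;
        uint8_t  b;

        /* Reassemble one varint, 7 bits at a time. */
        do
        {
            b = in[pos++];
            delta |= (uint16_t) ((b & 0x7F) << shift);
            shift += 7;
        } while ((b & 0x80) != 0 && pos < len);

        /* Undo the delta step to recover the absolute offset number. */
        prev = (OffsetNumber) (prev + delta);
        offsets[n++] = prev;
    }
    return n;
}
```

With this sketch, the offsets 1, 2, 3, 5, 8, 200, 201 encode to 8 bytes
rather than the 14 bytes that a plain 2-byte array needs. Only the jump
from 8 to 200 costs a second byte.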

That could also be used for the xl_heap_freeze_page record type --
though only after this patch is committed. The patch makes the WAL
record use a simple array of page offset numbers, just like in
PRUNE/VACUUM records. That's another reason why the approach
implemented by the patch seems like "the natural approach" to me. It's
much closer to how heapam PRUNE records work (we have a variable
number of arrays of page offset numbers in both cases).

--
Peter Geoghegan