Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2023-12-07 03:27:00
Message-ID: CANWCAZadN-sQWBNAPm884nbYBd2JozE=wpDK9gn=H6zj8g7XCA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 27, 2023 at 1:45 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sat, Oct 28, 2023 at 5:56 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:

> bool
> RT_SET(RT_RADIX_TREE *tree, uint64 key, RT_VALUE_TYPE *value_p);
> or for variable-length value support,
> RT_SET(RT_RADIX_TREE *tree, uint64 key, RT_VALUE_TYPE *value_p, size_t sz);
>
> If an entry already exists, update its value to 'value_p' and return
> true. Otherwise set the value and return false.

> RT_VALUE_TYPE
> RT_INSERT(RT_RADIX_TREE *tree, uint64 key, size_t sz, bool *found);
>
> If the entry already exists, replace the value with a new empty value
> with sz bytes and set "found" to true. Otherwise, insert an empty
> value, return its pointer, and set "found" to false.
>
> We probably will find a better name but I use RT_INSERT() for
> discussion. RT_INSERT() returns an empty slot regardless of existing
> values. It can be used to insert a new value or to replace the value
> with a larger value.

Looking at TidStoreSetBlockOffsets again (in particular how it works
with RT_GET), and thinking about issues we've discussed, I think
RT_SET is sufficient for vacuum. Here's how it could work:

TidStoreSetBlockOffsets could have a stack variable that's "almost
always" large enough. When it isn't, it can allocate a larger buffer
in its own context. It sets the necessary bits in that buffer, then
passes the pointer to RT_SET along with the number of bytes to copy.
That seems very simple.
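
To make that concrete, here is a rough, standalone sketch of the idea.
Everything in it is a hypothetical stand-in: toy_rt_set is a tiny
linear-probe table that only mimics the proposed variable-length
RT_SET (copy sz bytes, return true if the key existed), the 4-word
stack buffer is an arbitrary guess at "almost always large enough",
and calloc stands in for allocating in a memory context. It is not
code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_STACK_WORDS 4   /* assumed size: covers offsets 0..255 */
#define TOY_NSLOTS      64  /* capacity of the toy table only */

typedef struct ToyEntry { bool used; uint64_t key; size_t sz; uint64_t *val; } ToyEntry;
typedef struct ToyTree { ToyEntry slots[TOY_NSLOTS]; } ToyTree;

/* Zeroed allocation; stands in for allocating in a memory context. */
static void *
toy_alloc(size_t sz)
{
    void *p = calloc(1, sz);

    if (p == NULL)
        abort();
    return p;
}

/* Hypothetical RT_SET stand-in: copy sz bytes, return true if key existed. */
static bool
toy_rt_set(ToyTree *tree, uint64_t key, const void *value_p, size_t sz)
{
    ToyEntry *target = NULL;
    bool found = false;

    for (int i = 0; i < TOY_NSLOTS; i++)
    {
        ToyEntry *e = &tree->slots[i];

        if (e->used && e->key == key)
        {
            target = e;
            found = true;
            break;
        }
        if (!e->used && target == NULL)
            target = e;         /* remember the first free slot */
    }
    if (target == NULL)
        abort();                /* toy table is full */
    if (found)
        free(target->val);      /* old value is replaced wholesale */
    target->used = true;
    target->key = key;
    target->sz = sz;
    target->val = toy_alloc(sz);
    memcpy(target->val, value_p, sz);
    return found;
}

/* Build the bitmap locally (offsets sorted ascending), then hand it to "set". */
static bool
toy_set_block_offsets(ToyTree *tree, uint64_t blkno,
                      const uint16_t *offsets, int num_offsets)
{
    uint64_t stack_buf[TOY_STACK_WORDS] = {0};
    uint64_t *buf = stack_buf;
    size_t nwords;
    bool found;

    assert(num_offsets > 0);
    nwords = (size_t) offsets[num_offsets - 1] / 64 + 1;
    if (nwords > TOY_STACK_WORDS)
        buf = toy_alloc(nwords * sizeof(uint64_t));    /* rare slow path */
    for (int i = 0; i < num_offsets; i++)
        buf[offsets[i] / 64] |= UINT64_C(1) << (offsets[i] % 64);
    found = toy_rt_set(tree, blkno, buf, nwords * sizeof(uint64_t));
    if (buf != stack_buf)
        free(buf);
    return found;
}

/* Lookup helper, used only to check the result. */
static bool
toy_is_member(const ToyTree *tree, uint64_t blkno, uint16_t off)
{
    for (int i = 0; i < TOY_NSLOTS; i++)
    {
        const ToyEntry *e = &tree->slots[i];

        if (e->used && e->key == blkno)
            return (size_t) off / 64 < e->sz / sizeof(uint64_t) &&
                ((e->val[off / 64] >> (off % 64)) & 1) != 0;
    }
    return false;
}
```

The caller never reads or resizes an existing value here. It builds
the whole bitmap locally and overwrites, which is what keeps the
vacuum path simple.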

At some future time, we can add a new function with the complex
business about getting the current value to modify it, with the
re-alloc'ing that it might require.

In other words, from both an API perspective and a performance
perspective, it makes sense for tid store to have a simple "set"
interface for vacuum that can be optimized for its characteristics
(insert only, ordered offsets), and also a more complex one for bitmap
scan (setting/unsetting bits of existing values, in any order). They
can share the same iteration interface, key types, and value types.
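
As a rough illustration of that split, the sketch below puts the two
flavors side by side on one shared value type. All names are
hypothetical, and the value is shrunk to a single 64-bit word per block
to keep it short, so none of this reflects the actual tid store layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared value type: one bitmap word, offsets 0..63 only. */
typedef struct ToyBlockValue { uint64_t bits; } ToyBlockValue;

/* "Set" flavor (vacuum-like): build from a batch and overwrite wholesale. */
static void
toy_value_set(ToyBlockValue *v, const uint16_t *offsets, int n)
{
    uint64_t bits = 0;

    for (int i = 0; i < n; i++)
    {
        assert(offsets[i] < 64);
        bits |= UINT64_C(1) << offsets[i];
    }
    v->bits = bits;             /* previous contents are discarded */
}

/* "Modify" flavor (bitmap-scan-like): touch an existing value, any order. */
static void
toy_value_add(ToyBlockValue *v, uint16_t off)
{
    assert(off < 64);
    v->bits |= UINT64_C(1) << off;
}

static void
toy_value_remove(ToyBlockValue *v, uint16_t off)
{
    assert(off < 64);
    v->bits &= ~(UINT64_C(1) << off);
}

/* Shared read side: both flavors are read back the same way. */
static bool
toy_value_contains(const ToyBlockValue *v, uint16_t off)
{
    assert(off < 64);
    return ((v->bits >> off) & 1) != 0;
}
```

Only the write paths differ. The reader works the same way regardless
of which interface produced the value.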

What do you think, Masahiko?
