From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
Subject: | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date: | 2023-12-07 03:27:00 |
Message-ID: | CANWCAZadN-sQWBNAPm884nbYBd2JozE=wpDK9gn=H6zj8g7XCA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 27, 2023 at 1:45 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sat, Oct 28, 2023 at 5:56 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> bool
> RT_SET(RT_RADIX_TREE *tree, uint64 key, RT_VALUE_TYPE *value_p);
> or for variable-length value support,
> RT_SET(RT_RADIX_TREE *tree, uint64 key, RT_VALUE_TYPE *value_p, size_t sz);
>
> If an entry already exists, update its value to 'value_p' and return
> true. Otherwise set the value and return false.
> RT_VALUE_TYPE
> RT_INSERT(RT_RADIX_TREE *tree, uint64 key, size_t sz, bool *found);
>
> If the entry already exists, replace the value with a new empty value
> with sz bytes and set "found" to true. Otherwise, insert an empty
> value, return its pointer, and set "found" to false.
>
> We probably will find a better name but I use RT_INSERT() for
> discussion. RT_INSERT() returns an empty slot regardless of existing
> values. It can be used to insert a new value or to replace the value
> with a larger value.
Looking at TidStoreSetBlockOffsets again (in particular how it works
with RT_GET), and thinking about issues we've discussed, I think
RT_SET is sufficient for vacuum. Here's how it could work:
TidStoreSetBlockOffsets could have a stack variable that's "almost
always" large enough. When not, it can allocate in its own context. It
sets the necessary bits there. Then, it passes the pointer to RT_SET
with the number of bytes to copy. That seems very simple.
At some future time, we can add a new function with the complex
business about getting the current value to modify it, with the
re-alloc'ing that it might require.
In other words, from both an API perspective and a performance
perspective, it makes sense for tid store to have a simple "set"
interface for vacuum that can be optimized for its characteristics
(insert only, ordered offsets). And also a more complex one for bitmap
scan (setting/unsetting bits of existing values, in any order). They
can share the same iteration interface, key types, and value types.
What do you think, Masahiko?
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-12-07 03:57:15 | Re: pg16 && GSSAPI && Heimdal/Macos |
Previous Message | Japin Li | 2023-12-07 03:25:16 | Re: Transaction timeout |