Quick Links

Re: Expanding HOT updates for expression and partial indexes

From:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To:	"Burd, Greg" <gregburd(at)amazon(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Expanding HOT updates for expression and partial indexes
Date:	2025-03-05 22:56:42
Message-ID:	CAEze2WjUjf2yB48vB9RP-C=ZKibO2XPEb4+zTj3pPJZOa9M+Ng@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

Sorry for the delay. This is a reply for the mail thread up to 17 Feb,
so it might be very out-of-date by now, in which case sorry for the
noise.

On Mon, 17 Feb 2025 at 20:54, Burd, Greg <gregburd(at)amazon(dot)com> wrote:
> On Feb 15, 2025, at 5:49 AM, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
> >
> > In HEAD, we have a clear indication of which classes of indexes to
> > update, with TU_UpdateIndexes. With this patch, we have to derive that
> > from the (lack of) bits in the bitmap that might be output by the
> > table_update procedure.
>
> Yes, but... that "clear indication" is lacking the ability to convey more detailed information. It doesn't tell you which summarizing indexes really need updating just that as a result of being on the HOT path all summarizing indexes require updates.

Agreed that it's not great if you want to know about which indexes
were meaningfully updated. I think that barring significant advances
in performance of update checks, we can devise a way of transfering
this info to the table_tuple_update caller once we get a need for more
detailed information (e.g. this could be transfered through the
IndexInfo* that's currently also used by index_unchanged_by_update).

> > I think we can do with an additional parameter for which indexes would
> > be updated (or store that info in the parameter which also will hold
> > EState et al). I think it's cheaper that way, too - only when
> > update_indexes could be TU_SUMMARIZING we might need the exact
> > information for which indexes to insert new tuples into, and it only
> > really needs to be sized to the number of summarizing indexes (usually
> > small/nonexistent, but potentially huge).
>
> Okay, yes with this patch we need only concern ourselves with all, none, or some subset of summarizing as before. I'll work on the opaque parameter next iteration.

Thanks!

> > -----
> >
> > I notice that ExecIndexesRequiringUpdates() does work on all indexes,
> > rather than just indexes relevant to this exact phase of checking. I
> > think that is a waste of time, so if we sort the indexes in order of
> > [hotblocking without expressions, hotblocking with expressions,
> > summarizing], then (with stored start/end indexes) we can save time in
> > cases where there are comparatively few of the types we're not going
> > to look at.
>
> If I plan on just having ExecIndexesRequiringUpdates() return a bool rather than a bitmap then sorting, or even just filtering the list of IndexInfo to only include indexes with expressions, makes sense. That way the only indexes in question in that function's loop will be those that may spoil the HOT path. When that list is length 0, we can skip the tests entirely.

Yes, exactly. Though I'm not sure it should hit that path if the
length is 0, as would mean we had expression indexes that matched the
updated columns, but somehow none are in the list?

> > You're extracting type info from the opclass, to use in
> > datum_image_eq(). Couldn't you instead use the index relation's
> > TupleDesc and its stored attribute information instead? That saves us
> > from having to do further catalog lookups during execution. I'm also
> > fairly sure that that information is supposed to be a more accurate
> > representation of attributes' expression output types than the
> > opclass' type information (though, they probably should match).
>
> I hadn't thought of that, I think it's a valid idea and I'll update accordingly. I think I understand what you are suggesting.

Thanks, that change was exactly what I meant.

> >
> > -----
> >
> > The operations applied in ExecIndexesRequiringUpdates partially
> > duplicate those done in index_unchanged_by_update. Can we (partially)
> > unify this, and pass which indexes were updated through the IndexInfo,
> > rather than the current bitmap?
>
> I think I do that now, feel free to say otherwise. When the expression is checked in ExecIndexesExpressionsWereNotUpdated() I set:
>
> /* Shortcut index_unchanged_by_update(), we know the answer. */ indexInfo->ii_CheckedUnchanged = true; indexInfo->ii_IndexUnchanged = !changed;
>
> That prevents duplicate effort in index_unchanged_by_update().

Exactly, yes.

> > -----
> >
> > I don't see a good reason to add IndexInfo to Relation, by way of
> > rd_indexInfoList. It seems like an ad-hoc way of passing data around,
> > and I don't think that's the right way.
>
> At one point I'd created a way to get this set via relcache, I will resurrect that approach but I'm not sure it is what you were hinting at.

AFAIK, we don't have IndexInfo in the relcaches currently. I'm very
hesitant to add an executor node (!) subtype to catalog caches, as
IndexInfos are also used to store temporary information about e.g.
index tuple insertion state, which (if IndexInfo is stored in
relcaches) would imply modifying relcache entries without any further
locks, and I'm not sure that's at all an OK thing to do.

> The current method avoids pulling a the lock on the index to build the list, but doing that once in relcache isn't horrible. Maybe you were suggesting using that opaque struct to pass around the list of IndexInfo? Let me know on this one if you had a specific idea. The swap I've made in v6 really just moves the IndexInfo list to a filtered list with a new name created in relcache.

My main concern is the storage of executor nodes(!) directly in the
relcache. I don't think we need that: We have relatively direct access
to the right IndexInfo** in ResultRelInfo->ri_IndexRelationInfo, which
I think should be sufficient for this purpose. (The relevant RRI is
available in table_tuple_update caller ExecUpdateAct; and could be
passed down by ExecSimpleRelationUpdate to simple_table_tuple_update,
covering both (current, core) callers of table_tuple_update). That
would then be passed down to the TableAM using an opaque pointer type;
for example (names, file locations, exact layout all bikesheddable):

/* tableam.h */
/* exact definition somewhere else, in e.g. an executor_internal.h */
typedef struct TU_UpdateIndexData TU_UpdateIndexData;

table_tuple_update(..., TU_UpdateIndexData *idxupdate, ...)

/* executor.h */

TU_UpdateIndexes
UpdateDetermineChangedIndexes(TU_UpdateIndexData *idxupdate,
TableTupleSlot *old, TableTupleSlot *new, bitmap *changed_atts, ...);

/* executor_internal.h */
struct TU_UpdateIndexData
{
EState estate;
IndexInfo **idxinfos;
...
}

-----

Looking at your later comments about RRI in patch v8, I think that
would solve and clean up the way that you currently get access to the
RRI and thus index set.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

In response to

Re: Expanding HOT updates for expression and partial indexes at 2025-02-17 19:53:47 from Burd, Greg

Responses

Re: Expanding HOT updates for expression and partial indexes at 2025-03-06 12:22:58 from Burd, Greg

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2025-03-05 23:10:41	Re: Add contrib/pg_logicalsnapinspect
Previous Message	Quan Zongliang	2025-03-05 22:50:24	Re: Allow LISTEN on patterns