Re: Expanding HOT updates for expression and partial indexes

From: "Burd, Greg" <gregburd(at)amazon(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expanding HOT updates for expression and partial indexes
Date: 2025-02-13 18:46:18
Message-ID: EA519A89-282D-4535-88C6-C79588FF1DA8@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached find an updated patchset v5 that is an evolution of v4.

Changes v4 to v5 are:
* replaced GUC with table reloption called "expression_checks" (open to other name ideas)
* minimal documentation updates to README.HOT to address changes
* avoid, when possible, the expensive path that requires evaluating an estate using bitmaps
* determines the set of summarized indexes requiring updates, only updates those
* more tests in heap_hot_updates.sql (perhaps too many...)
* rebased to master, formatted, and make check-world passes

More comments in context below...

-greg

> On Feb 11, 2025, at 4:18 PM, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
>
>
> On Mon, 10 Feb 2025 at 20:11, Burd, Greg <gregburd(at)amazon(dot)com> wrote:
>>> On Feb 10, 2025, at 12:17 PM, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
>>>
>>>>
>>>> I have a few concerns with the patch, things I’d greatly appreciate your thoughts on:
>>>>
>>>> First, I pass an EState along the update path to enable running the checks in heapam, this works but leaves me feeling as if I violated separation of concerns. If there is a better way to do this let me know or if you think the cost of creating one in the execIndexing.c ExecIndexesRequiringUpdates() is okay that’s another possibility.
>>>
>>> I think that doesn't have to be bad.
>>
>> Meaning that the approach I’ve taken is okay with you?
>
> If you mean "passing EState down through the AM so we can check if we
> really need HOT updates", then Yes, that's OK. I don't think the basic
> idea (executing the projections to check for differences) is bad per
> se, however I do think we may need more design work on the exact shape
> of how ExecIndexesRequiringUpdates() receives its information (which
> includes the EState).
>
> E.g. we could pass an opaquely typed pointer that's passed from the
> executor state through the table_update method into this
> ExecIndexesRequiringUpdates(). That opaque struct would then contain
> the important information for index update state checking, so that the
> AM can't realistically break things without bypassing the separation
> of concerns, and doesn't have to know about any executor nodes.

I'm open to this idea and will attempt an implementation in v6, ideas welcome.

>>>> Third, there is overhead to this patch, it is no longer a single simple bitmap test to choose HOT or not in heap_update().
>>>
>>> Why can't it mostly be that simple in simple cases?
>>
>> It can remain that simple in the cases you mention. In relcache the hot blocking attributes are nearly the same, only the summarizing attributes are removed. The first test then in heap_update() is for overlap with the modified set. When there is none, the update will proceed on the HOT path.
>>
>> The presence of a summarizing index is determined in ExecIndexesRequiringUpdates() in execIndexing.c, so a slightly longer code path but not much new overhead.
>
> Yes, but that's only determined at an index-by-index level, rather
> than determined all at once, and that's real bad when you have
> hundreds of indexes to go through (however unlikely it might be, I've
> heard of cases where there are 1000s of indexes on one table). So, I
> prefer limiting any O(n_indexes) operations to only the most critical
> cases.

This makes sense, and I agree that avoiding O(n_indexes) operations is a good goal when possible.

>>> I mean, it's clear that "updated indexed column's value == non-HOT
>>> update". And that to determine whether an updated *projected* column's
>>> value (i.e., expression index column's value) was actually updated we
>>> need to calculate the previous and current index value, thus execute
>>> the projection twice. But why would we have significant additional
>>> overhead if there are no expression indexes, or when we can know by
>>> bitmap overlap that the only interesting cases are summarizing
>>> indexes?
>>
>> You’re right, there’s not a lot of new overhead in that case except what happens in ExecIndexesRequiringUpdates() to scan over the list of IndexInfo. It is really only when there are many expressions/predicates requiring examination that there is any significant cost to this approach AFAICT (but if you see something please point it out).
>
> See the attached approach. Evaluation of the expressions only has to
> happen if there are any HOT-blocking attributes which are exclusively
> hot-blockingly indexed through expressions, so if the updated
> attribute numbers are a subset of hotblockingexprattrs. (substitute
> hotblocking with summarizing for the summarizing approach)

I believe I've incorporated the gist of your idea in this v5 patch, let me know if I missed something.

>>> I would've implemented this with (1) two new bitmaps, one each for
>>> normal and summarizing indexes, each containing which columns are
>>> exclusively used in expression indexes (and which should thus be used
>>> to trigger the (comparatively) expensive recalculation).
>>
>> That was one where I started, over time that became harder to work as the bitmaps contain the union of index attributes for the table not per-column.
>
> I think it's fairly easy to create, though.
>
>> Now there is one bitmap to cover the broadest case and then a function to find the modified set of indexes where each is examined against bitmaps that contain only attributes specific to the index in question. This helped in cases where there were both expression and non-expression indexes on the same attribute.
>
> Fair, but do we care about one expression index on (attr1->>'data')'s
> value *not* changing when an index on (attr1) exists and attr1 has
> changed? That index on att1 would block HOT updates regardless of the
> (lack of) changes to the (att1->>'data') index, so doing those
> expensive calculations seems quite wasteful.

Agreed, when both a non-expression and an expression index exist on the same attribute then the expression checks are unnecessary and should be avoided. In this v5 patchset this case becomes two checks of bitmaps (first hot_attrs, then exclusively exp_attrs) before proceeding with a non-HOT update.

> So, in my opinion, we should also keep track of those attributes only
> included in expressions of indexes, and that's fairly easy: see
> attached prototype.diff.txt (might need some work, the patch was
> drafted on v16's codebase, but the idea is clear).

Thank you for your patch, I've included and expanded it.

> The resulting %exprattrs bitmap contains attributes that are used only
> in expressions of those index types.
>
>>> The "new" output of these expression
>>> evaluations would be stored to be used later as index datums, reducing
>>> the number of per-expression evaluations down to 2 at most, rather
>>> than 2+1 when the index needs an insertion but the expression itself
>>> wasn't updated.
>>
>> Not reforming the new index tuples is also an interesting optimization. I wonder how that can be passed from within heapam’s call into a function in execIndexing up into nodeModifiyTable and back down into execIndexing and on to the index access method? I’ll have to think about that, ideas welcome.
>
> Note that index tuple forming happens only in the index AM, it's the
> Datum construction (i.e. projection from attributes/tuple to indexed
> value) that I'd like to deduplicate. Though looking at the current
> code, I don't think it's reasonable to have that as a requirement for
> this work. It'd be a nice-to-have for sure, but not as requirement.

Agreed that it's a nice-to-have, but not a priority.

>>> Do you have any documentation on the approaches used, and the specific
>>> differences between v3 and v4? I don't see much of that in your
>>> initial mail, and the patches themselves also don't show much of that
>>> in their details. I'd like at least some documentation of the new
>>> behaviour in src/backend/access/heap/README.HOT at some point before
>>> this got marked as RFC in the commitfest app, though preferably sooner
>>> rather than later.
>>
>> Good point, I should have updated README.HOT with the initial patchset. I’ll jump on that and update ASAP.
>
> Thanks in advance.
>
> Kind regards,
>
> Matthias van de Meent
> Neon (https://neon.tech)
> <prototype.diff.txt>

Attachment Content-Type Size
v5-0001-Expand-HOT-update-path-to-include-expression-and-.patch application/octet-stream 97.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Степан 2025-02-13 18:47:37 Re: Patch: Log parameter types in detailed query logging
Previous Message Fabrízio de Royes Mello 2025-02-13 18:18:51 Re: Remove a unnecessary argument from execute_extension_script()