Re: Improve the efficiency of _bt_killitems.

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: feichanghong <feichanghong(at)qq(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, hu_yajun(at)qq(dot)com
Subject: Re: Improve the efficiency of _bt_killitems.
Date: 2024-11-01 10:50:40
Message-ID: e0aa1bbd-7aea-47d9-ae55-df55fda034c3@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/11/2024 10:41, feichanghong wrote:
>
>
>> On Nov 1, 2024, at 16:24, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>
>> On 01/11/2024 09:19, feichanghong wrote:
>>> Hi hackers,
>>> In the _bt_killitems function, the following logic is present: we
>>> search to the right for an index item that matches the heap TID and
>>> attempt to mark it as dead. If that index item has already been
>>> marked as dead by other concurrent processes, we will continue
>>> searching. However, there should not be any more matching index items
>>> on the current page.
>> Why could there not be more matching items on the page?
>>
>> Are you assuming a unique index? Even then it's not right; you can
>> have multiple index entries point to different heap tuples with the
>> same key, as long as they're not visible at the same time. For
>> example, if you UPDATE or DELETE+INSERT a row.
>
> Maybe I didn't describe it clearly. What I meant to say is that there
> shouldn't
> be multiple index items on the current page pointing to the same heap
> TID(ctid).
> rather than the same index key.

Ah, gotcha. That makes sense.

There's more we could do here in the similar vein:

If the TID was found in a posting list, but not all of the items were
dead, 'killtuple' is set to false so we will still continue the search.

Then there's this comment:

> /*
> * Don't bother advancing the outermost loop's int iterator to
> * avoid processing killed items that relate to the same
> * offnum/posting list tuple. This micro-optimization hardly
> * seems worth it. (Further iterations of the outermost loop
> * will fail to match on this same posting list's first heap
> * TID instead, so we'll advance to the next offnum/index
> * tuple pretty quickly.)
> */

I think we should do that micro-optimization. It's a single line of code
and a single instruction to advance 'i' variable AFAICS. While I agree
with the comment that it doesn't matter much performance-wise, it seems
simpler to just do it than explain why we don't do it.

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Kirill Reshke 2024-11-01 10:43:55 Add missing tab completion for ALTER TABLE ADD COLUMN IF NOT EXISTS