Re: WIP: parallel GiST index builds

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP: parallel GiST index builds
Date: 2024-07-26 09:30:12
Message-ID: 6478355b-19b9-446a-8175-c8f4a551b50e@proxel.se
Lists: pgsql-hackers

On 7/22/24 2:08 PM, Andrey M. Borodin wrote:
> During inserting tuples we need NSN on page. For NSN we can use just a counter, generated by gistGetFakeLSN() which in turn will call GetFakeLSNForUnloggedRel(). Or any other shared counter.
> After inserting tuples we call log_newpage_range() to actually WAL-log pages.
> All NSNs used during build must be less than LSNs used to insert new tuples after index is built.

I feel the tricky part about doing that is that we need to make sure the
fake LSNs are all less than the current real LSN when the index build
completes. While that should normally be the case, we would have an
almost never exercised code path for when the fake LSN becomes bigger
than the real LSN, and that path may contain bugs. Is that really worth
optimizing?

But if we are going to use fake LSNs: since the index being built is not
visible to any scans, we do not have to use GetFakeLSNForUnloggedRel()
but could use our own counter in shared memory, in the GISTShared struct
for this specific index, which starts at FirstNormalUnloggedLSN. This
would give us slightly less contention, and it would also decrease the
risk (for good and bad) of the fake LSN being larger than the real LSN.
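
To make the idea concrete, here is a minimal standalone sketch of such a
per-build counter. It is not PostgreSQL code: it uses C11 atomics so it
compiles on its own, whereas an actual patch would presumably use the
pg_atomic_* primitives. The struct and function names are hypothetical,
and the value 1000 for FirstNormalUnloggedLSN is an assumption that
should be checked against xlogdefs.h.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;	/* stand-in for PostgreSQL's XLogRecPtr */

/* Assumed to match PostgreSQL's FirstNormalUnloggedLSN; verify before use. */
#define FIRST_NORMAL_UNLOGGED_LSN ((XLogRecPtr) 1000)

/* Hypothetical slice of the shared state for one parallel GiST build. */
typedef struct GistBuildShared
{
	_Atomic XLogRecPtr next_fake_lsn;	/* next value to hand out */
} GistBuildShared;

/* Called once by the leader before any worker starts inserting. */
static void
gist_build_shared_init(GistBuildShared *shared)
{
	atomic_init(&shared->next_fake_lsn, FIRST_NORMAL_UNLOGGED_LSN);
}

/* Hand out the next fake LSN/NSN; safe to call from several workers. */
static XLogRecPtr
gist_build_next_fake_lsn(GistBuildShared *shared)
{
	return atomic_fetch_add(&shared->next_fake_lsn, 1);
}

/*
 * End-of-build check: returns true only when every value handed out so
 * far is strictly less than real_lsn.
 */
static bool
gist_build_fake_lsns_below(GistBuildShared *shared, XLogRecPtr real_lsn)
{
	return atomic_load(&shared->next_fake_lsn) <= real_lsn;
}
```

Because the counter is private to one build, it only advances once per
call made by that build's workers. The check at the end is the rarely
exercised path mentioned above: what to do when it returns false is
left open here.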

Andreas
