Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: 2024-08-08 13:53:00
Message-ID: CANtu0ohFr7OzNSbxqBhUpR0mXDYyt0Xt6+=Tbq0EC7as7kr+Lg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, Matthias!

> While waiting for this, here are some initial comments on the github
diffs:

Thanks for your review!
While stress testing the POC, I found some issues unrelated to the patch
that need to be fixed first.
This is [1] and [2].

>> Additional index is lightweight and does not produce any WAL.
> That doesn't seem to be what I see in the current patchset:

Persistence is passed as parameter [3] and set to RELPERSISTENCE_UNLOGGED
for auxiliary indexes [4].

> - I notice you've added a new argument to
> heapam_index_build_range_scan. I think this could just as well be
> implemented by reading the indexInfo->ii_Concurrent field, as the
> values should be equivalent, right?

Not always; currently, it is set by ResetSnapshotsAllowed[5].
We fall back to regular index build if there is a predicate or expression
in the index (which should be considered "safe" according to [6]).
However, we may remove this check later.
Additionally, there is no sense in resetting the snapshot if we already
have an xmin assigned to the backend for some reason.

> In heapam_index_build_range_scan, it seems like you're popping the
> snapshot and registering a new one while holding a tuple from
> heap_getnext(), thus while holding a page lock. I'm not so sure that's
> OK, expecially when catalogs are also involved (specifically for
> expression indexes, where functions could potentially be updated or
> dropped if we re-create the visibility snapshot)

Yeah, good catch.
Initially, I implemented a different approach by extracting the catalog
xmin to a separate horizon [7]. It might be better to return to this option.

> In heapam_index_build_range_scan, you pop the snapshot before the
> returned heaptuple is processed and passed to the index-provided
> callback. I think that's incorrect, as it'll change the visibility of
> the returned tuple before it's passed to the index's callback. I think
> the snapshot manipulation is best added at the end of the loop, if we
> add it at all in that function.

Yes, this needs to be fixed as well.

> The snapshot reset interval is quite high, at 500ms. Why did you
> configure it that low, and didn't you make this configurable?

It is just a random value for testing purposes.
I don't think there is a need to make it configurable.
Getting a new snapshot is a cheap operation now, so we can do it more often
if required.
Internally, I was testing it with a 0ms interval.

> You seem to be using WAL in the STIR index, while it doesn't seem
> that relevant for the use case of auxiliary indexes that won't return
> any data and are only used on the primary. It would imply that the
> data is being sent to replicas and more data being written than
> strictly necessary, which to me seems wasteful.

It just looks like an index with WAL, but as mentioned above, it is
unlogged in actual usage.

> The locking in stirinsert can probably be improved significantly if
> we use things like atomic operations on STIR pages. We'd need an
> exclusive lock only for page initialization, while share locks are
> enough if the page's data is modified without WAL. That should improve
> concurrent insert performance significantly, as it would further
> reduce the length of the exclusively locked hot path.

Hm, good idea. I'll check it later.

Best regards & thanks again,
Mikhail

[1]:
https://www.postgresql.org/message-id/CANtu0ohHmYXsK5bxU9Thcq1FbELLAk0S2Zap0r8AnU3OTmcCOA%40mail.gmail.com
[2]:
https://www.postgresql.org/message-id/CANtu0ojga8s9%2BJ89cAgLzn2e-bQgy3L0iQCKaCnTL%3Dppot%3Dqhw%40mail.gmail.com
[3]:
https://github.com/postgres/postgres/compare/master...michail-nikolaev:postgres:new_index_concurrently_approach#diff-50abc48efcc362f0d3194aceba6969429f46fa1f07a119e555255545e6655933R93
[4]:
https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/catalog/index.c#L1600
[5]:
https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/catalog/index.c#L2657
[6]:
https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/commands/indexcmds.c#L1129
[7]:
https://github.com/postgres/postgres/commit/38b243d6cc7358a44cb1a865b919bf9633825b0c

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2024-08-08 13:55:53 Re: First draft of PG 17 release notes
Previous Message Daniel Gustafsson 2024-08-08 13:01:57 Re: Enable data checksums by default