Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica

From: Ben Chobot <bench(at)silentmedia(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
Date: 2022-02-09 04:28:39
Message-ID: 07ad5e5b-2176-5c7d-dfa7-7f63231df22b@silentmedia.com
Lists: pgsql-bugs

Michael Paquier wrote on 2/8/22 5:35 PM:
> On Tue, Feb 08, 2022 at 01:23:34PM -0800, Peter Geoghegan wrote:
>> I find the idea that we'd fail to WAL-log information that is needed
>> during Hot Standby (to prevent this race condition) plausible.
>> Michael?
> Yeah, REINDEX relies on some existing index definition, so it feels
> like we are missing a piece related to invalid indexes in all that. A
> main difference is the lock level, as exclusive locks are getting
> logged so the standby can react and wait on that. The 30-minute mark
> is interesting. Ben, did you change any replication-related GUCs that
> could influence that? Say, wal_receiver_timeout, hot_standby_feedback
> or max_standby_streaming_delay?

Oh, to be clear, the 30-minute mark is more "the loop has always failed
by this point": sometimes it fails after 5 minutes and sometimes it takes
longer, but I've never seen it run for more than 20-something minutes. I was
thinking that was just because of the race condition, but, to answer your
question, yes, we have tuned some replication parameters. Here are the
ones you asked about; did you want to see the values of any others?

=# show wal_receiver_timeout ;
 wal_receiver_timeout
──────────────────────
 1min
(1 row)

04:26:27 db: postgres(at)postgres, pid:29507
=# show hot_standby_feedback ;
 hot_standby_feedback
──────────────────────
 on
(1 row)

04:26:40 db: postgres(at)postgres, pid:29507
=# show max_standby_streaming_delay ;
 max_standby_streaming_delay
─────────────────────────────
 10min
(1 row)
