| From: | Martín Fernández <fmartin91(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | pgsql-general(at)postgresql(dot)org |
| Subject: | Re: Reindex "locked" standby database |
| Date: | 2021-12-15 04:10:45 |
| Message-ID: | 8A21C35C-EE30-445A-933A-354603824D54@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
Micheal,
Thanks for much for the quick response.
> On 15 Dec 2021, at 00:37, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Dec 15, 2021 at 12:15:27AM -0300, Martín Fernández wrote:
>> The reindex went fine in the primary database and in one of our
>> standby. The other standby that we also operate for some reason
>> ended up in a state where all transactions were locked by the WAL
>> process and the WAL process was not able to make any progress. In
>> order to solve this issue we had to move traffic from the “bad”
>> standby to the healthy one and then kill all transactions that were
>> running in the “bad” standby. After that, replication was able to
>> resume successfully.
>
> You are referring to the startup process that replays WAL, right?
That is correct, I’m talking about the startup process that replays the WAL files.
> Without having an idea about the type of workload your primary and/or
> standbys are facing, as well as an idea of the configuration you are
> using on both (hot_standby_feedback for one), I have no direct idea,
Primary handle IOT data ingestion. The table that we had to REINDEX gets updated every time a new message arrives in the system so updated are happening very often on that table, thus, the index/table bloat. The standby at any point in time would be receiving queries that would take advantage of the indexes that were being re indexed. hot_standby_feedback is currently turned OFF on the standbys.
> but that could be a conflict caused by a concurrent vacuum.
>
> Seeing where things got stuck could also be useful, perhaps with a
> backtrace of the area where it happens and some information around
> it.
>
>> I’m just trying to understand what could have caused this issue. I
>> was not able to identify any queries in the standby that would be
>> locking the WAL process. Any insight would be more than welcome!
>
> That's not going to be easy without more information, I am afraid.
> --
> Michael
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Martín Fernández | 2021-12-15 04:20:51 | Re: Reindex "locked" standby database |
| Previous Message | Mladen Gogala | 2021-12-15 03:52:36 | Re: Reindex "locked" standby database |