From: | Radoslav Nedyalkov <rnedyalkov(at)gmail(dot)com> |
---|---|
To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: conflict with recovery when delay is gone |
Date: | 2020-11-13 18:13:42 |
Message-ID: | CANhtRiY30QiOWn1AFgiiAaAiwyaAip73fzz=vGZeCHeEV18Srg@mail.gmail.com |
Lists: | pgsql-general |
On Fri, Nov 13, 2020 at 7:37 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:
> On Fri, 2020-11-13 at 15:24 +0200, Radoslav Nedyalkov wrote:
> > On a very busy master-standby setup which runs typical olap processing -
> > long living , massive writes statements, we're getting on the standby:
> >
> > ERROR: canceling statement due to conflict with recovery
> > FATAL: terminating connection due to conflict with recovery
> >
> > The weird thing is that cancellations happen usually after standby has experienced
> > some huge delay(2h), still not at the allowed maximum(3h). Even recently run statements
> > got cancelled when the delay is already at zero.
> >
> > Sometimes the situation got relaxed after an hour or so.
> > Restarting the server instantly helps.
> >
> > It is pg11.8, centos7, hugepages, shared_buffers 196G from 748G.
> >
> > What phenomenon could we be facing?
>
> Hard to say. Perhaps an unusual kind of replication conflict?
>
> What is in "pg_stat_database_conflicts" on the standby server?
>
db01=# select * from pg_stat_database_conflicts;
 datid |  datname  | confl_tablespace | confl_lock | confl_snapshot | confl_bufferpin | confl_deadlock
-------+-----------+------------------+------------+----------------+-----------------+----------------
 13877 | template0 |                0 |          0 |              0 |               0 |              0
 16400 | template1 |                0 |          0 |              0 |               0 |              0
 16402 | postgres  |                0 |          0 |              0 |               0 |              0
 16401 | db01      |                0 |          0 |             51 |               0 |              0
(4 rows)

On a freshly restarted standby we have just seen similar behaviour after a
2-hour delay and a slow catch-up.
confl_snapshot is 51, and we have exactly the same number of cancelled
statements.
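
Since all the conflicts counted are of the snapshot kind, the standby settings that usually bear on them can be checked with something like the sketch below. It is only an illustration: which of these settings actually matter in this setup is an assumption, and the lag expression is a rough estimate based on pg_last_xact_replay_timestamp().

```sql
-- Standby-side settings commonly involved in snapshot conflicts
-- (assumption: these are the relevant ones for this setup).
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('hot_standby_feedback',
               'max_standby_streaming_delay',
               'max_standby_archive_delay');

-- Rough replay lag on the standby; NULL if nothing has been replayed yet.
SELECT now() - pg_last_xact_replay_timestamp() AS approx_replay_lag;
```

On an idle primary this lag estimate keeps growing even when the standby is fully caught up, so it is best read alongside the conflict counters rather than on its own.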