From: Tomas Pospisek <tpo2(at)sourcepole(dot)ch>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: very long secondary->primary switch time
Date: 2021-04-28 06:03:56
Message-ID: 3cc70134-f869-05f7-f970-30583fa859b6@sourcepole.ch
Lists: pgsql-general
Hi Tom, hi list participants,

thanks a lot for replying, Tom.
On 27.04.21 22:16, Tom Lane wrote:
> Tomas Pospisek <tpo2(at)sourcepole(dot)ch> writes:
>> I maintain a postgresql cluster that does failover via patroni. The
>> problem is that after a failover happens it takes the secondary too long
>> (that is about 35min) to come up and answer queries. The log of the
>> secondary looks like this:
>
>> 04:00:29.777 [9679] LOG: received promote request
>> 04:00:29.780 [9693] FATAL: terminating walreceiver process due to
>> administrator command
>> 04:00:29.780 [9679] LOG: invalid record length at 320/B95A1EE0: wanted
>> 24, got 0
>> 04:00:29.783 [9679] LOG: redo done at 320/B95A1EA8
>> 04:00:29.783 [9679] LOG: last completed transaction was at log time
>> 2021-03-03 03:57:46.466342+01
>
>> 04:35:00.982 [9679] LOG: selected new timeline ID: 15
>> 04:35:01.404 [9679] LOG: archive recovery complete
>> 04:35:02.337 [9662] LOG: database system is ready to accept connections
>
>> The cluster is "fairly large" with thousands of DBs (sic!) and ~1TB of data.
>
> Hm. WAL replay is already done at the "redo done" entry. There is a
> checkpoint after that, I believe, and there may be some effort to search
> for dead files as well. Still, if your I/O subsystem is better than
> a wet noodle, 35 minutes is a long time to finish that.
>
> One thing I'm not sure about is whether we try to do the checkpoint
> at maximum speed. If you have set GUC options to throttle checkpoint
> I/O hard, that could perhaps explain this.
I didn't customize the PostgreSQL settings much and am mostly running
what was provided by upstream. These are my explicit settings
(everything else is at its default):
allow_system_table_mods: 'off'
autovacuum: 'off'
checkpoint_completion_target: '0.7'
effective_cache_size: 16 GB
maintenance_work_mem: 1 GB
max_connections: 200
max_wal_size: 16 GB
shared_buffers: 8 GB
shared_preload_libraries: pg_stat_statements
track_activities: 'on'
track_activity_query_size: 32 kB
track_functions: all
wal_keep_segments: 100
work_mem: 64 MB
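For reference, the effective values of the GUCs that throttle checkpoint
I/O can be double-checked directly on the server; a minimal sketch,
assuming local psql access as the postgres user:

```shell
# Show the settings most relevant to checkpoint I/O throttling
# (as suggested above). Values here are whatever the server reports.
psql -c "SHOW checkpoint_completion_target;"
psql -c "SHOW checkpoint_timeout;"
psql -c "SHOW max_wal_size;"
```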
> You could possibly learn more by strace'ing the startup process to
> see what it's doing.
Will do, thanks. However, I'm dreading the next failover downtime :-(
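In case it helps others following along, a sketch of what I plan to run
during the next promotion (the PID would be whatever the startup process
has at that time, not the 9679 from the log above):

```shell
# Find the startup (recovery) process of the promoting secondary.
pgrep -af 'postgres: startup'

# Attach strace with timestamps and per-syscall durations, writing the
# trace to a file for later analysis (replace <pid> with the real PID):
sudo strace -p <pid> -tt -T -f -o /tmp/startup.strace

# Or just get a summary of time spent per syscall:
sudo strace -p <pid> -c
```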
> Also, what PG version is that exactly?
12.6-1.pgdg20.04+1
Thankful for any help or pointers regarding the long promotion time,
*t