Re: improve wals replay on secondary

From:	Fabio Pardi <f(dot)pardi(at)portavita(dot)eu>
To:	Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
Cc:	pgsql-performance(at)lists(dot)postgresql(dot)org
Subject:	Re: improve wals replay on secondary
Date:	2019-05-27 11:04:51
Message-ID:	caeafb3f-7e91-eacf-f22e-dcfa6a1ad5bf@portavita.eu
Lists:	pgsql-performance

If you did not even see this message in your standby logs:

restartpoint starting: xlog

then it means that the checkpoint was never even started.

In that case, I have no clue.

Try to describe step by step how to reproduce the problem, together with
your setup and the version numbers of Postgres and repmgr, and I might be
able to help you further.
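
To check a standby log for the messages discussed in this thread, something like the following minimal Python sketch might help. The function name and the sample lines are only illustrative. The marker strings are the ones quoted in this thread for 9.6 and may be worded differently in other versions.

```python
# Marker substrings taken from the log lines quoted in this thread (PG 9.6).
# Assumption: other PostgreSQL versions may word these messages differently.
MARKERS = {
    "restored": "restored log file",
    "consistent": "consistent recovery state reached",
    "restartpoint_start": "restartpoint starting",
    "restartpoint_done": "restartpoint complete",
}


def summarize_standby_log(lines):
    """Count how many lines contain each marker substring."""
    counts = {key: 0 for key in MARKERS}
    for line in lines:
        for key, needle in MARKERS.items():
            if needle in line:
                counts[key] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the messages quoted in this thread.
    sample = [
        '2019-05-22 12:48:10 EEST 60942 LOG: restored log file '
        '"000000010000377B000000DE" from archive',
        "2019-05-22 12:48:11 EEST db63311 FATAL: the database system is starting up",
    ]
    for key, count in summarize_standby_log(sample).items():
        print(f"{key}: {count}")
```

If "restored" keeps growing while "consistent" and "restartpoint_start" stay at zero, the standby is fetching WAL from the archive but has not yet logged either message.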

regards,

fabio pardi

On 5/27/19 12:17 PM, Mariel Cherkassky wrote:
> standby_mode = 'on'
> primary_conninfo = 'host=X.X.X.X user=repmgr connect_timeout=10 '
> recovery_target_timeline = 'latest'
> primary_slot_name = repmgr_slot_1
> restore_command = 'rsync -avzhe ssh postgres(at)x(dot)x(dot)x(dot)x:/var/lib/pgsql/archive/%f /var/lib/pgsql/archive/%f ; gunzip < /var/lib/pgsql/archive/%f > %p'
> archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup /var/lib/pgsql/archive %r'
>
> On Mon, 27 May 2019 at 12:29, Fabio Pardi
> <f(dot)pardi(at)portavita(dot)eu> wrote:
>
> Hi Mariel,
>
> let's keep the list in cc...
>
> settings look ok.
>
> what's in the recovery.conf file then?
>
> regards,
>
> fabio pardi
>
> On 5/27/19 11:23 AM, Mariel Cherkassky wrote:
> > Hey,
> > the configuration is the same as in the primary :
> > max_wal_size = 2GB
> > min_wal_size = 1GB
> > wal_buffers = 16MB
> > checkpoint_completion_target = 0.9
> > checkpoint_timeout = 30min
> >
> > Regarding your question, I didn't see this message (consistent recovery
> > state reached at), I guess that's why the secondary isn't available yet.
> >
> > Maybe I'm wrong, but what I understood from the documentation is that a
> > restartpoint is generated only after the secondary has had a checkpoint,
> > which means only after 30 minutes or after max_wal_size is reached? But
> > still, why won't the secondary reach a consistent recovery state (does it
> > require a restartpoint to be generated)?
> >
> >
> > On Mon, 27 May 2019 at 12:12, Fabio Pardi
> > <f(dot)pardi(at)portavita(dot)eu> wrote:
> >
> > Hi Mariel,
> >
> > if I'm not wrong, on the secondary you will see the messages you
> > mentioned when a checkpoint happens.
> >
> > What are checkpoint_timeout and max_wal_size on your standby?
> >
> > Did you ever see this on your standby log?
> >
> > "consistent recovery state reached at .."
> >
> >
> > Maybe you can post the whole configuration of your standby for easier
> > debugging.
> >
> > regards,
> >
> > fabio pardi
> >
> >
> >
> >
> > On 5/27/19 10:49 AM, Mariel Cherkassky wrote:
> > > Hey,
> > > PG 9.6, I have a standalone instance configured. I tried to start up a
> > > secondary by running standby clone (repmgr). The clone process took 3
> > > hours, and during that time WALs were generated (mostly because of the
> > > checkpoint_timeout). As a result, when I start the secondary, I see
> > > that the secondary keeps getting the WALs, but I don't see any messages
> > > that indicate that the secondary tried to replay the WALs.
> > > Messages that I see:
> > > receiving incremental file list
> > > 000000010000377B000000DE
> > >
> > > sent 30 bytes received 4.11M bytes 8.22M bytes/sec
> > > total size is 4.15M speedup is 1.01
> > > 2019-05-22 12:48:10 EEST 60942 LOG: restored log file "000000010000377B000000DE" from archive
> > > 2019-05-22 12:48:11 EEST db63311 FATAL: the database system is starting up
> > > 2019-05-22 12:48:12 EEST db63313 FATAL: the database system is starting up
> > >
> > > I was hoping to see the following messages (taken from a different
> > > machine):
> > > 2019-05-27 01:15:37 EDT 7428 LOG: restartpoint starting: time
> > > 2019-05-27 01:16:18 EDT 7428 LOG: restartpoint complete: wrote 406 buffers (0.2%); 1 transaction log file(s) added, 0 removed, 0 recycled; write=41.390 s, sync=0.001 s, total=41.582 s; sync files=128, longest=0.000 s, average=0.000 s; distance=2005 kB, estimate=2699 kB
> > > 2019-05-27 01:16:18 EDT 7428 LOG: recovery restart point at 4/D096C4F8
> > >
> > > My primary settings (WAL settings):
> > > wal_buffers = 16MB
> > > checkpoint_completion_target = 0.9
> > > checkpoint_timeout = 30min
> > >
> > > Any idea what can explain why the secondary doesn't replay the WALs?
> >
> >
>

In response to

Re: improve wals replay on secondary at 2019-05-27 10:17:49 from Mariel Cherkassky

Responses

Re: improve wals replay on secondary at 2019-05-28 10:54:51 from Fabio Pardi
