From: | Rory Falloon <rfalloon(at)gmail(dot)com> |
---|---|
To: | andres(at)anarazel(dot)de |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Dealing with latency to replication slave; what to do? |
Date: | 2018-07-24 20:08:38 |
Message-ID: | CANP_6+NRfinHdxixJz3YoxAgc6oTk8=OdCGce_m-EjF8=emH+Q@mail.gmail.com |
Lists: | pgsql-general |
Hi Andres,
Regarding your first reply: I inferred that from the fact that I saw those
messages at the same time the replication stream fell behind. What other
logs would be more pertinent to this situation?
On Tue, Jul 24, 2018 at 4:02 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,
>
> On 2018-07-24 15:39:32 -0400, Rory Falloon wrote:
> > Looking for any tips here on how to best maintain a replication slave which
> > is operating under some latency between networks - around 230ms. On a good
> > day/week, replication will keep up for a number of days; however, when
> > the link is under higher than average usage, replication can
> > last merely minutes before falling behind again.
> >
> > 2018-07-24 18:46:14 GMTLOG: database system is ready to accept read only connections
> > 2018-07-24 18:46:15 GMTLOG: started streaming WAL from primary at 2B/93000000 on timeline 1
> > 2018-07-24 18:59:28 GMTLOG: incomplete startup packet
> > 2018-07-24 19:15:36 GMTLOG: incomplete startup packet
> > 2018-07-24 19:15:36 GMTLOG: incomplete startup packet
> > 2018-07-24 19:15:37 GMTLOG: incomplete startup packet
> >
> > As you can see above, it lasted about half an hour before falling out of
> > sync.
>
> How can we see that from the above? The "incomplete startup messages"
> are independent of streaming rep? I think you need to show us more logs.
>
>
> > On the master, I have wal_keep_segments=128. What is happening when I see
> > "incomplete startup packet" - is it simply the slave has fallen behind,
> > and cannot 'catch up' using the wal segments quick enough? I assume the
> > slave is using the wal segments to replay changes and assuming there are
> > enough wal segments to cover the period it cannot stream properly, it will
> > eventually recover?
>
> You might want to look into replication slots to ensure the primary
> keeps the necessary segments, but not more, around. You might also want
> to look at wal_compression, to reduce the bandwidth usage.
>
> Greetings,
>
> Andres Freund
>
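
To put a number on the question of whether wal_keep_segments=128 is enough for the slave to catch up, here is a rough Python sketch. It assumes the default WAL segment size of 16 MiB; the function names and the primary LSN in the usage lines are hypothetical, and only the standby position 2B/93000000 comes from the log above. Since wal_keep_segments is a minimum and the primary may retain more, the check is conservative rather than exact.

```python
# Assumption: the cluster uses PostgreSQL's default 16 MiB WAL segment size.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '2B/93000000' to an absolute byte position.

    An LSN is written as two hexadecimal numbers: the high 32 bits,
    a slash, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def retained_bytes(wal_keep_segments: int,
                   segment_bytes: int = WAL_SEGMENT_BYTES) -> int:
    """Minimum amount of WAL the primary keeps for standbys, in bytes."""
    return wal_keep_segments * segment_bytes


def standby_can_catch_up(primary_lsn: str, standby_lsn: str,
                         wal_keep_segments: int) -> bool:
    """True if the standby's lag fits inside the guaranteed retention window."""
    lag = lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)
    return lag <= retained_bytes(wal_keep_segments)


if __name__ == "__main__":
    # 128 segments of 16 MiB each is 2 GiB of guaranteed WAL.
    print(retained_bytes(128) / 1024 ** 3, "GiB retained")
    # Hypothetical primary position exactly 2 GiB ahead of the standby.
    print(standby_can_catch_up("2C/13000000", "2B/93000000", 128))
```

If the lag exceeds that window, the slave cannot resume streaming from the retained segments alone, which is the situation a replication slot is meant to prevent.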