From: | Jerry Sievers <gsievers19(at)comcast(dot)net> |
---|---|
To: | Joe Van Dyk <joe(at)tanga(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Replication failed after stalling |
Date: | 2013-12-18 20:33:05 |
Message-ID: | 86ppot527y.fsf@jerry.enova.com |
Lists: | pgsql-general |
Joe Van Dyk <joe(at)tanga(dot)com> writes:
> I'm running Postgresql 9.3. I have a streaming replication server. Someone was running a long COPY query (8 hours) on the standby which halted replication. The
> replication stopped at 3:30 am. I canceled the long-running query at 9:30 am and replication data started catching up.
>
> The data up until 10 am got restored fine (took until 10:30 am to restore that much). Then I started getting errors like "FATAL: could not receive data from WAL
> stream: ERROR: requested WAL segment 00000001000003C300000086 has already been removed".
>
> I'm confused about how pg could restore data from 3:30 am to 10 am, then start complaining about missing WAL files.
>
> What's the best way to avoid this problem? Increase wal_keep_segments?
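Yes, and/or run a hybrid of streaming replication and WAL shipping, so
the standby can fetch from the archive any segment the master has
already removed.
Quite simply, your wal_keep_segments setting was almost enough to get
you through that backlog period, but as your standby was catching up, it
hit a point where the next segment it needed was already gone from the
master.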
Depending on how much traffic your master sees at various times of the
day, it's unsurprising that your grace period is a lot shorter during
peak load than during off-peak times, because WAL segments are filled
and recycled more quickly.
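To put rough numbers on that grace period, here is a small Python sketch
of the arithmetic. It assumes the default 16 MB WAL segment size; the
function names and the traffic figures are made up for illustration, so
measure your own WAL generation rate before sizing anything.

```python
import math

# Default WAL segment size in PostgreSQL (assumes it was not changed at build time).
WAL_SEGMENT_MB = 16


def retention_hours(wal_keep_segments, wal_mb_per_hour):
    """Approximate hours of WAL the master retains at a given write rate."""
    if wal_mb_per_hour <= 0:
        raise ValueError("wal_mb_per_hour must be positive")
    return wal_keep_segments * WAL_SEGMENT_MB / wal_mb_per_hour


def segments_needed(stall_hours, wal_mb_per_hour):
    """Approximate wal_keep_segments needed to ride out a standby stall."""
    if stall_hours < 0 or wal_mb_per_hour < 0:
        raise ValueError("inputs must be non-negative")
    return math.ceil(stall_hours * wal_mb_per_hour / WAL_SEGMENT_MB)


if __name__ == "__main__":
    # Hypothetical figures: the same setting covers far less time at peak.
    print(retention_hours(1000, 2000))  # off-peak, 2000 MB of WAL per hour
    print(retention_hours(1000, 8000))  # peak, 8000 MB of WAL per hour
    print(segments_needed(6, 8000))     # six-hour stall at the peak rate
```

With those invented rates, the same wal_keep_segments value covers four
times less wall-clock time at peak than off-peak, which is how a standby
can replay several hours of backlog and still hit a removed segment later.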
HTH
>
> Joe
>
--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres(dot)consulting(at)comcast(dot)net
p: 312.241.7800