Re: Standby Mechanics: WAL vs Streaming

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Don Seiler <don(at)seiler(dot)us>
Cc: MichaelDBA <MichaelDBA(at)sqlexec(dot)com>, pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Standby Mechanics: WAL vs Streaming
Date: 2017-08-27 06:40:44
Message-ID: CAECtzeWXCnLkbwZyfDg7=nUJTe8WY0UQQqkayvz0GxWN-dqb5Q@mail.gmail.com
Lists: pgsql-admin

2017-08-27 5:16 GMT+02:00 Don Seiler <don(at)seiler(dot)us>:

> On Sat, Aug 26, 2017 at 9:42 PM, MichaelDBA <MichaelDBA(at)sqlexec(dot)com>
> wrote:
>
>> pgbackrest archives WALs faster than a custom archive_command. I
>> use pgbackrest to automate archiving of WALs from primary to pgbackrest
>> server. From there, my restore_command on the slave pulls WALs if needed
>> to catch up to SR (streaming replication).
>>
>
> Interesting, I'll look into pgbackrest. Thanks!
>
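
For what it's worth, wiring pgbackrest into the WAL flow is mostly a matter
of two settings (the stanza name below is just an example). On the primary,
in postgresql.conf:

    archive_command = 'pgbackrest --stanza=main archive-push %p'

and on the standby, in recovery.conf:

    restore_command = 'pgbackrest --stanza=main archive-get %f "%p"'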
>
>> The standby will recover from the streaming xlog data if local WAL files
>> are not available. (So the standby prefers local WAL files to streaming?)
>> The other way around: The standby prefers streaming replication. It only
>> pulls WALs (restore_command) if it is too far behind.
>>
>
> OK. That's what I had thought initially, but for some reason I convinced
> myself that it would think it's easier to get it locally when available.
>
>
The way PostgreSQL does it is actually a lot simpler.

When you start the standby, it first looks for archived WAL to replay. When
there's none left, it tries to connect to the primary. Once connected, it
asks for a specific WAL location. If the primary still has it, streaming
starts, and it stays that way (using streaming) as long as it can. If
there's an error, for example because the primary had a lot of writing to
do and recycled its WAL too quickly, some WAL segments may be gone before
they were ever sent to the standby. In such a case, the primary and the
standby will start logging messages like: FATAL: requested WAL segment
XXXX has already been removed. When this happens, the streaming connection
between the primary and the standby is dropped, and the standby goes back
to replaying whichever archived WAL files it needs. When it has finally
replayed them all, it tries to connect to the primary again (which
hopefully will now have the WAL it needs).

So, it's actually kind of simple:
* first, replay any archived WAL
* when there's no more archived WAL, switch to streaming
* on a streaming error, fall back to archived WAL replay
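
To make that concrete, here is a minimal recovery.conf sketch (for 9.x)
combining both mechanisms. The host, user, and archive path are made up for
the example:

    standby_mode = 'on'
    # Streaming replication connection to the primary.
    primary_conninfo = 'host=primary.example.com port=5432 user=replicator'
    # Used when streaming isn't possible: fetch archived WAL segments.
    # %f is the WAL file name, %p the path PostgreSQL wants it copied to.
    restore_command = 'cp /mnt/wal_archive/%f "%p"'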

>
>> I saw that the WAL archives were being removed from the archive
>> directory. If the standby was already ahead of the WAL files because of
>> streaming replication, would it still use the restore_command to restore
>> WAL files from archive to the pg_xlog directory? Or would it just delete the
>> archive files?
>> The archived files in your case (stored on the standby) would only be
>> removed if you had pg_archivecleanup command also in your recovery.conf
>> file for WAL cleanup (archive_cleanup_command).
>>
>
> Yes we do have that, sorry I didn't specify. I just wanted to confirm that
> it would be used to clean up archives that arrive well after their
> usefulness with streaming replication.
>
>
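For reference, the cleanup entry in recovery.conf would look something like
this (the archive path is made up). %r expands to the name of the oldest
WAL file still required by the last restartpoint, so everything older than
it can be removed safely:

    archive_cleanup_command = 'pg_archivecleanup /mnt/wal_archive %r'
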
>
>> As far as the lag goes, what can we do to increase the throughput for
>> streaming replication? We tried to give it almost all of our 100Mb pipe
>> between data centers and there was still significant lag until it recovered
>> past the point of my manual vacuums.
>> What do you expect? Manual vacuum full on big tables will generate a lot
>> of WAL thereby causing SR lag.
>>
>
> As I've said in other recent emails to this list, I'm relatively new to
> these waters, having spent the last 15+ years as an Oracle DBA. I figured
> that manual vacuum would generate a lot of changes for replication. That's
> why I'm asking where I can improve the throughput (if possible), given the
> additional activity spike.
>
>
>> I'm going to assume that streaming replication and WAL rsyncing are
>> going to compete for bandwidth. Would it be advisable to only
>> do streaming most of the day and then maybe rsync WAL files in a nightly
>> batch? Now that I think of it, we do hourly snapmirrors of that WAL storage
>> already (also using the same bandwidth), perhaps we just use that for the
>> WAL replication.
>> How about doing your manual vacuums at night time during low load times?
>>
>
> You're assuming that night time is our off-peak time (although I did say
> "nightly batch", it doesn't mean it's a better time for maintenance work in
> our case). There's a reason I was running the vacuum when I was this
> morning, and it isn't something we regularly do. We had a table with 3
> corrupt indexes that had prevented autovacuum from running on that table
> for many months. Today I recreated those indexes and ran a manual vac
> freeze (it was well past the min freeze age), during an off-peak time
> negotiated with the business side of the shop.
>
> Thanks again for the pgbackrest information. I'll check it out. I'm
> thinking at the very least we could just present that NFS snapmirror as a
> read-only volume to the standby and point restore_command at it. That
> snapmirror is taking place regardless, so the additional rsync we're doing
> in our archive_command script on the primary is just piling on and eating
> up more bandwidth.
>
>
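
That should work. Assuming the snapmirror is mounted read-only on the
standby (the mount point below is made up), the restore_command can be as
simple as:

    restore_command = 'cp /mnt/wal_snapmirror/%f "%p"'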

--
Guillaume.
