Re: Understanding streaming replication

From: Pawel Veselov <pawel(dot)veselov(at)gmail(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Understanding streaming replication
Date: 2012-11-12 18:37:50
Message-ID: CAMnJ+BdcLbYsKMNZUiHjoggr5Rspz+Y174yXxOi31_ZgjW_53g@mail.gmail.com
Lists: pgsql-general

On Mon, Nov 12, 2012 at 10:11 AM, Pawel Veselov <pawel(dot)veselov(at)gmail(dot)com> wrote:

>
> On Mon, Nov 12, 2012 at 1:36 AM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>
>> I'll try to answer the questions I can.
>>
>
> Thank you!
>
>
>> Pawel Veselov wrote:
>> > I've been struggling with understanding all the necessary pieces for
>> > streaming replication. So I put down the pieces as I did understand them,
>> > and would appreciate if you guys could point out any of the stuff I
>> > understood or have done wrong.
>> >
>> > The set up is pgpool + streaming replication + hot stand by. No load
>> > balancing, stand-by nodes will not receive any application queries (I
>> > don't have that big of a query load, and I don't want to risk
>> > inconsistent reads). There are no shared file systems, but there is a
>> > way to rsync/scp files between nodes. Fail-over is automatic, and should
>> > kick in within reasonably small period after master failure.
>> >
>> > 1. Archiving. Should be turned on on all the nodes. The archive command
>> > should copy the archive file to the local archive directory, and rsync
>> > archive directory between all the nodes. My understanding is that
>> > archiving is necessary if a stand-by node ever "missed" enough WAL
>> > updates to need an old enough WAL that might have been removed from
>> > pg_xlog.
>>
>> You don't give details about how the rsync is triggered,
>> but I'd advise against having rsync as part of archive_command.
>> First, it is slow and if there is a lot of activity, the
>> archiver will not be able to keep up.
>> Second, if rsync fails, the WAL file will not be considered
>> archived.
>>
>> Both these things will keep the WAL files from being deleted
>> from pg_xlog.
>>
>> I'd schedule rsync as a cron job or similar.
>>
>
> From your later comments, it's also apparent that these archived WALs will
> be useless after failover (for the purpose of recovery), so there is no
> reason to send them to all the nodes after all.
>
>
I obviously lost it here. The archives do need to be synchronized, for the
purpose of recovering slaves. If a slave dies, and I want to recover it, it
may need the archived WALs, and for this, the archives should be available
on the node. So, rsync (or something like that) is necessary. But it's a
bad idea to run the rsync from the archive command itself.
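
To make that split concrete, here is a minimal sketch in Python, not a
tested setup. It assumes archive_command calls a small script that only
copies the segment into the local archive directory, and that a separate
cron job pushes the directory to the other nodes. The function names, the
.tmp suffix and the rsync options chosen are my own assumptions for
illustration.

```python
import os
import shutil


def archive_wal(src_path, archive_dir):
    """Copy one WAL segment into the local archive directory.

    Returns 0 on success and 1 on failure, so it can serve as the exit
    status of archive_command (the server treats non-zero as "not archived"
    and retries later).
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Refuse to overwrite a segment that is already archived.
        return 1
    tmp = dest + ".tmp"  # hypothetical suffix, excluded from rsync below
    try:
        shutil.copy2(src_path, tmp)
        os.rename(tmp, dest)  # atomic when tmp and dest share a file system
    except OSError:
        if os.path.exists(tmp):
            os.remove(tmp)
        return 1
    return 0


def rsync_command(archive_dir, peer):
    """Build the rsync invocation meant for a cron job, not archive_command."""
    base = archive_dir.rstrip("/")
    return [
        "rsync", "-a",
        "--ignore-existing",  # do not rewrite segments the peer already has
        "--exclude=*.tmp",    # skip copies that are still in progress
        base + "/",
        peer + ":" + base + "/",
    ]
```

The local copy is the only thing the archiver waits for, so a slow or
failing rsync no longer keeps WAL files in pg_xlog; the cron job can run
the list returned by rsync_command (for example through subprocess.run)
as often as the acceptable lag between the archives allows.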
