From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | Aidan Van Dyk <aidan(at)highrise(dot)ca>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL |
Date: | 2010-03-18 14:27:59 |
Message-ID: | 3f0b79eb1003180727g7877743eq81274e014fe70a49@mail.gmail.com |
Lists: | pgsql-committers pgsql-docs pgsql-hackers |
On Wed, Mar 17, 2010 at 7:35 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Fujii Masao wrote:
>> I found another missing feature in new file-based log shipping (i.e.,
>> standby_mode is enabled and 'cp' is used as restore_command).
>>
>> After the trigger file is found, the startup process with pg_standby
>> tries to replay all of the WAL files in both pg_xlog and the archive.
>> So, when the primary fails, if the latest WAL file in pg_xlog of the
>> primary can be read, we can prevent the data loss by copying it to
>> pg_xlog of the standby before creating the trigger file.
>>
>> On the other hand, the startup process with standby mode doesn't
>> replay the WAL files in pg_xlog after the trigger file is found. So
>> failover always causes the data loss even if the latest WAL file can
>> be read from the primary. And if the latest WAL file is copied to the
>> archive instead, it can be replayed but a PANIC error would happen
>> because it's not filled.
>>
>> We should remove this restriction?
>
> Looking into this, I realized that we have a bigger problem related to
> this. Although streaming replication stores the streamed WAL files in
> pg_xlog, so that they can be re-replayed after a standby restart without
> connecting to the master, we don't try to replay those either. So if you
> restart standby, it will fail to start up if the WAL it needs can't be
> found in archive or by connecting to the master. That must be fixed.
I agree that this is a bigger problem. Since the standby always starts
walreceiver before replaying any WAL files in pg_xlog, walreceiver tries
to receive the WAL files following the REDO starting point even if they
are already in pg_xlog. IOW, the same WAL files might be shipped
from the primary to the standby many times. This behavior is wasteful
and should be addressed.
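
To illustrate the idea only (this is not PostgreSQL code), here is a
minimal sketch of how the streaming start point could skip WAL that is
already in pg_xlog. The function name and the modeling of WAL segments
as plain integers are assumptions made for the example, and it ignores
the case where the last local segment is only partially filled.

```python
from typing import Set


def first_segment_to_stream(redo_start: int, local_segments: Set[int]) -> int:
    """Return the first WAL segment that must be requested from the primary.

    Hypothetical model: segments are consecutive integers, and
    local_segments holds the ones already present in the standby's pg_xlog.
    """
    seg = redo_start
    # Skip every contiguous segment that can be replayed locally;
    # streaming only needs to begin at the first gap.
    while seg in local_segments:
        seg += 1
    return seg


if __name__ == "__main__":
    # Segments 5..7 are already local, so only 8 onward would be shipped.
    print(first_segment_to_stream(5, {5, 6, 7}))
```

Starting walreceiver from the REDO point corresponds to always
returning redo_start here, which is what causes the repeated shipping.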
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center