From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bug in walreceiver |
Date: | 2011-01-13 08:59:15 |
Message-ID: | 4D2EBEE3.2010709@enterprisedb.com |
Lists: | pgsql-hackers |
On 13.01.2011 10:28, Fujii Masao wrote:
> When the master shuts down or crashes, there seems to be
> a case where walreceiver exits without flushing WAL that
> has already been written. This might lead the startup process to
> replay unflushed WAL and break the write-ahead logging rule.
Hmm, that can happen at a crash even with no replication involved. If
you "kill -9 postmaster", and some WAL had been written but not fsync'd,
on crash recovery we will happily recover the unsynced WAL. We could
prevent that by fsyncing all WAL before applying it - presumably
fsyncing a file that has already been flushed is quick. But is it worth
the trouble?
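To make the "fsync before applying" idea concrete, here is a minimal sketch in plain POSIX C. It is not PostgreSQL code: the function name sync_segment_before_replay and the notion of passing it a segment path are assumptions made for illustration only. It opens a file, forces it to stable storage, and reports failure so the caller can refuse to replay.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): force a WAL segment
 * file to stable storage before anything reads and applies it.
 *
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 */
int
sync_segment_before_replay(const char *path)
{
    int fd;

    /* O_RDWR because some platforms refuse fsync() on a read-only fd */
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    if (fsync(fd) != 0)
    {
        int saved_errno = errno;    /* close() may overwrite errno */

        close(fd);
        errno = saved_errno;
        return -1;
    }

    return close(fd);
}
```

A caller would invoke this once per segment before replaying it and stop if it returns -1. Whether the extra fsync is actually cheap for already-flushed files depends on the kernel and filesystem, which is the open question above.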
> walreceiver.c
>> /* Wait a while for data to arrive */
>> if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
>> {
>>     /* Accept the received data, and process it */
>>     XLogWalRcvProcessMsg(type, buf, len);
>>
>>     /* Receive any more data we can without sleeping */
>>     while (walrcv_receive(0, &type, &buf, &len))
>>         XLogWalRcvProcessMsg(type, buf, len);
>>
>>     /*
>>      * If we've written some records, flush them to disk and let the
>>      * startup process know about them.
>>      */
>>     XLogWalRcvFlush();
>> }
>
> The problematic case happens when the latter walrcv_receive
> emits ERROR. In this case, the WAL received by the former
> walrcv_receive is not guaranteed to have been flushed yet.
>
> The attached patch ensures that all WAL received is flushed to
> disk before walreceiver exits. This patch should be backported
> to 9.0, I think.
Yeah, we probably should do that, even though it doesn't completely
close the window in which unsynced WAL can be replayed.
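The general shape of "flush whatever has been written before exiting, even on the error path" can be sketched as follows. This is a self-contained toy model, not the attached patch (which is not reproduced in this message) and not walreceiver code: every name in it is hypothetical, the byte counters stand in for real write and fsync calls, and setjmp/longjmp stands in for an ERROR raised inside the receive call.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not PostgreSQL symbols. */
static jmp_buf error_exit;      /* where a simulated ERROR lands */
static size_t written_bytes;    /* bytes written but maybe not flushed */
static size_t flushed_bytes;    /* bytes known to be flushed */

/* Stand-in for flushing: mark everything written so far as durable. */
static void
flush_written(void)
{
    flushed_bytes = written_bytes;
}

/*
 * Stand-in for a receive call. Returns true while data is available and
 * simulates an ERROR (via longjmp) on call number fail_at.
 */
static bool
receive_chunk(int call_no, int fail_at, int total_chunks)
{
    if (call_no == fail_at)
        longjmp(error_exit, 1);
    return call_no < total_chunks;
}

/*
 * Receive loop. Returns 0 on a clean end of stream, 1 if a receive
 * failed. In both cases everything written is flushed before returning.
 */
int
run_receiver(int total_chunks, int fail_at)
{
    volatile int call_no = 0;

    written_bytes = 0;
    flushed_bytes = 0;

    if (setjmp(error_exit) != 0)
    {
        /* Error path: flush what earlier iterations wrote, then exit */
        flush_written();
        return 1;
    }

    while (receive_chunk(call_no, fail_at, total_chunks))
    {
        written_bytes += 100;   /* "write" a chunk without flushing it */
        call_no++;
    }

    flush_written();            /* normal path */
    return 0;
}

/* Bytes that were written but never flushed. */
size_t
unflushed_bytes(void)
{
    return written_bytes - flushed_bytes;
}
```

Calling run_receiver(3, 1) simulates a failure in the second receive after one chunk has been written; unflushed_bytes() is 0 afterwards because the error path flushes first. As noted above, this only covers exits the process gets to handle. It does nothing for a kill -9 or a machine crash.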
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com