From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Sameer Thakur <samthakur74(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, sthomas(at)optionshouse(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Samrat Revagade <revagade(dot)samrat(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Inconsistent DB data in Streaming Replication |
Date: | 2013-04-15 07:32:03 |
Message-ID: | CBA631C1-F678-4335-ACCD-1E30846F732C@phlo.org |
Lists: | pgsql-hackers |
On Apr14, 2013, at 17:56 , Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> At fast shutdown, after walsender sends the checkpoint record and
> closes the replication connection, walreceiver can detect the close
> of connection before receiving all WAL records. This means that,
> even if walsender sends all WAL records, walreceiver cannot always
> receive all of them.
That sounds like a bug in walreceiver to me.
The following code in walreceiver's main loop looks suspicious:

    /*
     * Process the received data, and any subsequent data we
     * can read without blocking.
     */
    for (;;)
    {
        if (len > 0)
        {
            /* Something was received from master, so reset timeout */
            ...
            XLogWalRcvProcessMsg(buf[0], &buf[1], len - 1);
        }
        else if (len == 0)
            break;
        else if (len < 0)
        {
            ereport(LOG,
                    (errmsg("replication terminated by primary server"),
                     errdetail("End of WAL reached on timeline %u at %X/%X",
                               startpointTLI,
                               (uint32) (LogstreamResult.Write >> 32),
                               (uint32) LogstreamResult.Write)));
            ...
        }
        len = walrcv_receive(0, &buf);
    }

    /* Let the master know that we received some data. */
    XLogWalRcvSendReply(false, false);

    /*
     * If we've written some records, flush them to disk and
     * let the startup process and primary server know about
     * them.
     */
    XLogWalRcvFlush(false);

The loop at the top looks fine - it specifically avoids throwing
an error on EOF. But the code then proceeds to XLogWalRcvSendReply()
which doesn't seem to have the same smarts - it simply does

    if (PQputCopyData(streamConn, buffer, nbytes) <= 0 ||
        PQflush(streamConn))
        ereport(ERROR,
                (errmsg("could not send data to WAL stream: %s",
                        PQerrorMessage(streamConn))));

Unless I'm missing something, that certainly seems to explain
how a standby can lag behind even after a controlled shutdown of
the master.
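
To make the suspected ordering problem concrete, here is a minimal, self-contained C sketch. It is a toy model, not PostgreSQL code: `SketchReceiver` and the `sketch_*` functions are hypothetical names invented for illustration, and a `false` return value stands in for `ereport(ERROR)`. It assumes, as the reading above suggests, that a failed reply send keeps the subsequent flush from running. The second function, which flushes before replying, is included only for contrast and is not something proposed in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for walreceiver state; not a PostgreSQL type. */
typedef struct SketchReceiver
{
    bool conn_open;   /* false once the primary has closed the connection */
    int  received;    /* WAL records received so far */
    int  flushed;     /* WAL records flushed to disk so far */
} SketchReceiver;

/*
 * Models sending a status reply. Returns false when the connection is
 * closed; in this sketch that stands in for ereport(ERROR).
 */
static bool
sketch_send_reply(const SketchReceiver *r)
{
    return r->conn_open;
}

/* Models flushing everything that has been received. */
static void
sketch_flush(SketchReceiver *r)
{
    assert(r->flushed <= r->received);
    r->flushed = r->received;
}

/* Ordering as in the quoted excerpt: reply first, then flush. */
static bool
sketch_reply_then_flush(SketchReceiver *r)
{
    if (!sketch_send_reply(r))
        return false;   /* error path: the flush below is never reached */
    sketch_flush(r);
    return true;
}

/* Contrast case (assumption, not from the message): flush, then reply. */
static bool
sketch_flush_then_reply(SketchReceiver *r)
{
    sketch_flush(r);
    return sketch_send_reply(r);
}
```

In this model, with a closed connection, 5 records received and 2 flushed, `sketch_reply_then_flush` leaves `flushed` at 2, while `sketch_flush_then_reply` brings it to 5 before the reply fails.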
best regards,
Florian Pflug