From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net> |
Subject: | walsender termination error messages worse in v10 |
Date: | 2017-06-02 18:51:26 |
Message-ID: | 20170602185126.xzlcoublmaduixcy@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920
Author: Peter Eisentraut <peter_e(at)gmx(dot)net>
Date: 2017-03-23 08:36:36 -0400
Logical replication support for initial data copy
made walreceiver emit worse messages in v10 than before when the master
gets shut down. Before 10 you'll get something like:
2017-06-02 11:46:07.173 PDT [16143][] LOG: replication terminated by primary server
2017-06-02 11:46:07.173 PDT [16143][] DETAIL: End of WAL reached on timeline 1 at 0/1B7FB8C8.
2017-06-02 11:46:07.173 PDT [16143][] FATAL: could not send end-of-streaming message to primary: no COPY in progress
the last bit is a bit superflous, but it still kinda makes sense. Now
you get:
2017-06-02 11:47:09.362 PDT [17061][] FATAL: unexpected result after CommandComplete: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
the reason is that
static int
libpqrcv_receive(char **buffer, pgsocket *wait_fd)
{
previously did:
if (rawlen == -1) /* end-of-streaming or error */
{
PGresult *res;
res = PQgetResult(streamConn);
if (PQresultStatus(res) == PGRES_COMMAND_OK ||
PQresultStatus(res) == PGRES_COPY_IN)
{
PQclear(res);
return -1;
}
else
{
PQclear(res);
ereport(ERROR,
(errmsg("could not receive data from WAL stream: %s",
PQerrorMessage(streamConn))));
}
}
and now does
if (rawlen == -1) /* end-of-streaming or error */
{
PGresult *res;
res = PQgetResult(conn->streamConn);
if (PQresultStatus(res) == PGRES_COMMAND_OK)
{
PQclear(res);
/* Verify that there are no more results */
res = PQgetResult(conn->streamConn);
if (res != NULL)
ereport(ERROR,
(errmsg("unexpected result after CommandComplete: %s",
PQerrorMessage(conn->streamConn))));
return -1;
}
else if (PQresultStatus(res) == PGRES_COPY_IN)
{
PQclear(res);
return -1;
}
else
{
PQclear(res);
ereport(ERROR,
(errmsg("could not receive data from WAL stream: %s",
pchomp(PQerrorMessage(conn->streamConn)))));
}
}
note the new /* Verify that there are no more results */ bit.
I don't understand why the new block is there, nor does the commit
message explain it.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2017-06-02 18:52:36 | Re: Get stuck when dropping a subscription during synchronizing table |
Previous Message | Tom Lane | 2017-06-02 18:33:09 | Re: Perfomance bug in v10 |