From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | could not receive data from WAL stream: SSL SYSCALL error: Success |
Date: | 2017-11-15 10:46:43 |
Message-ID: | CAEepm=3cc5wYv=X4Nzy7VOUkdHBiJs9bpLzqtqJWxdDUp5DiPQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi hackers,
I heard a report of an error like this from a user of openssl
1.1.0f-3+deb9u on Debian:
pg_basebackup: could not receive data from WAL stream: SSL SYSCALL error: Success
I noticed that some man pages for SSL_get_error say this under
SSL_ERROR_SYSCALL:
Some non-recoverable I/O error occurred. The OpenSSL error queue
may contain more information on the error. For socket I/O on Unix
systems, consult errno for details.
But others say:
Some I/O error occurred. The OpenSSL error queue may contain more
information on the error. If the error queue is empty (i.e. ERR_get_error()
returns 0), ret can be used to find out more about the error: If ret == 0,
an EOF was observed that violates the protocol. If ret == -1, the underlying
BIO reported an I/O error (for socket I/O on Unix systems, consult errno for
details).
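To make the second wording concrete, here is a minimal sketch of the decision it describes. It is not OpenSSL or libpq code: the enum, the function name and the parameter names are hypothetical, with `ret` standing for the return value of the failed SSL call and `queue_err` for the value returned by ERR_get_error().

```c
/*
 * Hypothetical classification of an SSL_ERROR_SYSCALL result, following
 * only the wording of the older SSL_get_error man page text.
 */
typedef enum
{
    SYSCALL_SEE_ERROR_QUEUE,    /* ERR_get_error() returned non-zero */
    SYSCALL_EOF,                /* queue empty and ret == 0 */
    SYSCALL_SEE_ERRNO,          /* queue empty and ret == -1 */
    SYSCALL_UNDOCUMENTED        /* any other combination */
} syscall_cause;

static syscall_cause
classify_syscall_error(int ret, unsigned long queue_err)
{
    if (queue_err != 0)
        return SYSCALL_SEE_ERROR_QUEUE; /* the queue has the details */
    if (ret == 0)
        return SYSCALL_EOF;             /* EOF that violates the protocol */
    if (ret == -1)
        return SYSCALL_SEE_ERRNO;       /* BIO reported an I/O error */
    return SYSCALL_UNDOCUMENTED;        /* the man page text does not cover this */
}
```

Under that reading, an empty queue with ret == -1 points at errno, which is why "Success" in the message above looks odd.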
While wondering if it was the documentation or the behaviour that
changed and what it all means, I came across some discussion and a
reverted commit here:
https://github.com/openssl/openssl/issues/1903
The error reported to me seems to have occurred on a release whose man
page *doesn't* describe the ERR_get_error() == 0 case (unlike some of
the earlier tags you can get to from here):
https://github.com/openssl/openssl/blob/OpenSSL_1_1_0-stable/doc/ssl/SSL_get_error.pod
And yet clearly errno didn't hold an error number from a failed
syscall, which seems consistent with the older documented behaviour.
Perhaps pgtls_read(), pgtls_write() and open_client_SSL() should add
"&& ecode != 0" to the if statements in their SSL_ERROR_SYSCALL case
so that this case would fall to the "EOF detected" message instead of
logging the nonsensical (and potentially uninitialised?) errno
message, if indeed this is the behaviour described in older releases. On
the other hand, without documentation to support it in the current
release, we don't really *know* that it's an EOF condition. Due to
this murkiness and the fact that it's mostly harmless anyway, I'm not
proposing a change, but I thought I'd share this in case it makes more
sense to someone more familiar with this stuff.
--
Thomas Munro
http://www.enterprisedb.com