From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Rare SSL failures on eelpout |
Date: | 2019-03-04 21:08:02 |
Message-ID: | 26030.1551733682@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
>> That suggests that we could perhaps handle ECONNRESET both at startup
>> packet send time (for certificate rejection, eelpout's case) and at
>> initial query send (for idle timeout, bug #15598's case) by attempting
>> to read. Does that make sense?
> Hmm ... it definitely makes sense that we shouldn't assume that a *write*
> failure means there is nothing left to *read*.
After staring at the code for awhile, I am thinking that there may be
a bug of that ilk, but if so it's down inside OpenSSL. Perhaps it's
specific to the OpenSSL version you're using on eelpout? There is
not anything we could do differently in libpq, AFAICS, because it's
OpenSSL's responsibility to read any data that might be available.
I also looked into the idea that we're doing something wrong on the
server side, allowing the final error message to not get flushed out.
A plausible theory there is that SSL_shutdown is returning a WANT_READ
or WANT_WRITE error and we should retry it ... but that doesn't square
with your observation upthread that it's returning SSL_ERROR_SSL.
It's all very confusing, but I think there's a nontrivial chance
that this is an OpenSSL bug, especially since we haven't been able
to replicate it elsewhere.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2019-03-04 21:13:41 | Re: Allowing extensions to supply operator-/function-specific info |
Previous Message | Alvaro Herrera | 2019-03-04 20:46:07 | Re: monitoring CREATE INDEX [CONCURRENTLY] |