From: | Robbie Harwood <rharwood(at)redhat(dot)com> |
---|---|
To: | David Steele <david(at)pgmasters(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Subject: | Re: [PATCH v6] GSSAPI encryption support |
Date: | 2016-03-14 23:20:47 |
Message-ID: | jlg8u1kprxs.fsf@thriss.redhat.com |
Lists: | pgsql-hackers |
David Steele <david(at)pgmasters(dot)net> writes:
> On 3/14/16 4:10 PM, Robbie Harwood wrote:
>
>> David Steele <david(at)pgmasters(dot)net> writes:
>>
>>> On 3/8/16 5:44 PM, Robbie Harwood wrote:
>>>
>>>> Here's yet another version of GSSAPI encryption support. It's also
>>>> available for viewing on my github:
>>>
>>> psql simply hangs and never returns. I have attached a pcap of the
>>> psql/postgres session generated with:
>>
>> Please disregard my other email. I think I've found the issue; will
>> post a new version in a moment.
>
> Strange timing since I was just testing this. Here's what I got:
>
> $ pg/bin/psql -h localhost -U vagrant(at)PGMASTERS(dot)NET postgres
> conn->inStart = 179, conn->inEnd = 179, conn->inCursor = 179
> psql (9.6devel)
> Type "help" for help.
>
> postgres=>
Thanks, that certainly is interesting! I did finally manage to
reproduce the issue on my end, but the rate of incidence is much lower
than what you and Michael were seeing: I have to run connections in a
loop for about 10-20 minutes before it makes itself apparent (and no,
it's not due to entropy). Apparently I just wasn't patient enough.
> This was after commenting out:
>
> // appendBinaryPQExpBuffer(&conn->gwritebuf,
> // conn->inBuffer + conn->inStart,
> // conn->inEnd - conn->inStart);
> // conn->inEnd = conn->inStart;
>
> The good news I can log on every time now!
Since conn->inStart == conn->inEnd in the case you were testing, the
lines you commented out would have been a no-op anyway (that's the
normal case of operation, as far as I can tell). That said, the chances
of hitting the race for me seemed very dependent on how much code wants
to run in that conditional: I got it up to 30-40 minutes when I added a
lot of printf()s (can't just run in gdb because it's nondeterministic
and rr has flushing bugs at the moment).
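To make the no-op point concrete, here is a minimal sketch. It uses a toy struct rather than the real libpq PGconn or PQExpBuffer: toy_conn, forward_pending, and the fixed 256-byte buffers are made up for illustration, and only the field names inBuffer, inStart, inEnd, and gwritebuf come from the lines quoted above. It shows that when inStart == inEnd, appending the pending range and then resetting inEnd changes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the connection state discussed in the thread.
 * This is NOT the real PGconn; the buffer sizes are arbitrary. */
struct toy_conn {
    char   inBuffer[256];
    size_t inStart;        /* offset of the first unconsumed input byte */
    size_t inEnd;          /* offset one past the last byte read */
    char   gwritebuf[256]; /* toy replacement for the PQExpBuffer */
    size_t gwritelen;      /* bytes currently held in gwritebuf */
};

/* Mirrors the shape of the commented-out lines: append the pending
 * range [inStart, inEnd) to gwritebuf, then mark the input as consumed.
 * Returns the number of bytes moved. */
size_t forward_pending(struct toy_conn *c)
{
    assert(c->inStart <= c->inEnd);
    size_t n = c->inEnd - c->inStart;

    /* The toy buffer cannot grow, so insist that the data fits. */
    assert(n <= sizeof(c->gwritebuf) - c->gwritelen);

    memcpy(c->gwritebuf + c->gwritelen, c->inBuffer + c->inStart, n);
    c->gwritelen += n;
    c->inEnd = c->inStart;  /* already equal when n == 0 */
    return n;
}
```

With inStart == inEnd == 179, as in the debug output above, forward_pending() moves zero bytes and leaves gwritelen and inEnd unchanged, so skipping the call cannot alter the state in that case. The sketch says nothing about the race itself, only about why commenting out those lines is not a behavioral change when no input is pending.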
All that is to say: thank you very much for investigating that!