Re: GSSENC'ed connection stalls while reconnection attempts.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: GSSENC'ed connection stalls while reconnection attempts.
Date: 2020-07-13 05:35:13
Message-ID: 20200713.143513.1305817321981582482.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 10 Jul 2020 12:01:10 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
> > If psql connected using GSSAPI auth and server restarted, reconnection
> > sequence stalls and won't return.
>
> Yeah, reproduced here. (I wonder if there's any reasonable way to
> exercise this scenario in src/test/kerberos/.)
>
> > I found that psql(libpq) sends startup packet via gss
> > encryption. conn->gssenc should be reset when encryption state is
> > freed.
>
> Actually, it looks to me like the GSS support was wedged in by somebody
> who was paying no attention to how SSL is managed, or else we forgot
> to pay attention to GSS the last time we rearranged SSL support. It's
> completely broken for the multiple-host-addresses scenario as well,
> because try_gss is being set and cleared in the wrong places altogether.
> conn->gcred is not being handled correctly either I think --- we need
> to make sure that it's dropped in pqDropConnection.
>
> The attached patch makes this all act more like the way SSL is handled,
> and for me it resolves the reconnection problem.

It looks good to me.

> > The reason that psql doesn't notice the error is pqPacketSend returns
> > STATUS_OK when write error occurred. That behavior contradicts to the
> > comment of the function. The function is used only while making
> > connection so it's ok to error out immediately by write failure. I
> > think other usage of pqFlush while making a connection should report
> > write failure the same way.
>
> I'm disinclined to mess with that, because (a) I don't think it's the
> actual source of the problem, and (b) it would affect way more than
> just GSS mode.

If I did that in pqFlush your objection would be right, but
pqPacketSend is defined as "RETURNS: STATUS_ERROR if the write fails"
so not doing that is just wrong. (pqSendSome reported write failure in
this case.) For other parts in authentication code, I don't think it
doesn't affect badly because authentication should proceed without any
read/write overlapping.

> > Finally, It's user-friendly if psql shows error message for error on
> > reset attempts. (This perhaps should be arguable.)
>
> Meh, that's changing fairly longstanding behavior that I don't think
> we've had many complaints about.

Yeah, I haven't seen the message for any other reasons than the
absence of server. :p And, I noticed that, in the first place, I would
see that message in the next connection attempt from scratch.

I agree to you on this point.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-07-13 05:39:54 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Dilip Kumar 2020-07-13 05:17:26 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions