Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Andrey Chudnovsky <achudnovskij(at)gmail(dot)com>
To: Jacob Champion <jchampion(at)timescale(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, mahendrakar s <mahendrakarforpg(at)gmail(dot)com>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "smilingsamay(at)gmail(dot)com" <smilingsamay(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: [PoC] Federated Authn/z with OAUTHBEARER
Date: 2023-07-07 21:16:05
Message-ID: CACrwV57ik2NSdUO77HEa828y4zceGdufJxyT5ASHK4-iiJYV7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Jacob for making progress on this.

> 3) Is the current conn->async_auth() entry point sufficient for an
> application to implement the Microsoft flows discussed upthread?

Please confirm my understanding of the flow is correct:
1. Client calls PQconnectStart.
- The client doesn't know yet what is the issuer and the scope.
- Parameters are strings, so callback is not provided yet.
2. Client gets PgConn from PQconnectStart return value and updates
conn->async_auth to its own callback.
3. Client polls PQconnectPoll and checks conn->sasl_state until the
value is SASL_ASYNC
4. Client accesses conn->oauth_issuer and conn->oauth_scope and uses
those info to trigger the token flow.
5. Expectations on async_auth:
a. It returns PGRES_POLLING_READING while token acquisition is going on
b. It returns PGRES_POLLING_OK and sets conn->sasl_state->token
when token acquisition succeeds.
6. Is the client supposed to do anything with the altsock parameter?

Is the above accurate understanding?

If yes, it looks workable with a couple of improvements I think would be nice:
1. Currently, oauth_exchange function sets conn->async_auth =
pg_fe_run_oauth_flow and starts Device Code flow automatically when
receiving challenge and metadata from the server.
There probably should be a way for the client to prevent default
Device Code flow from triggering.
2. The current signature and expectations from async_auth function
seems to be tightly coupled with the internal implementation:
- Pieces of information need to be picked and updated in different
places in the PgConn structure.
- Function is expected to return PostgresPollingStatusType which
is used to communicate internal state to the client.
Would it make sense to separate the internal callback used to
communicate with Device Code flow from client facing API?
I.e. introduce a new client facing structure and enum to facilitate
callback and its return value.

-----------
On a separate note:
The backend code currently spawns an external command for token validation.
As we discussed before, an extension hook would be a more efficient
extensibility option.
We see clients make 10k+ connections using OAuth tokens per minute to
our service, and stating external processes would be too much overhead
here.

-----------

> 5) Does this maintenance tradeoff (full control over the client vs. a
> large amount of RFC-governed code) seem like it could be okay?

It's nice for psql to have Device Code flow. Can be made even more
convenient with refresh tokens support.
And for clients on resource constrained devices to be able to
authenticate with Client Credentials (app secret) without bringing
more dependencies.

In most other cases, upstream PostgreSQL drivers written in higher
level languages have libraries / abstractions to implement OAUTH flows
for the platforms they support.

On Fri, Jul 7, 2023 at 11:48 AM Jacob Champion <jchampion(at)timescale(dot)com> wrote:
>
> On Thu, Jul 6, 2023 at 1:48 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion(at)timescale(dot)com> wrote:
> > > Something analogous to libcurl's socket and timeout callbacks [1],
> > > then? Or is there an existing libpq API you were thinking about using?
> >
> > Yeah. Libpq already has an event concept.
>
> Thanks -- I don't know how I never noticed libpq-events.h before.
>
> Per-connection events (or callbacks) might bring up the same
> chicken-and-egg situation discussed above, with the notice hook. We'll
> be fine as long as PQconnectStart is guaranteed to return before the
> PQconnectPoll engine gets to authentication, and it looks like that's
> true with today's implementation, which returns pessimistically at
> several points instead of just trying to continue the exchange. But I
> don't know if that's intended as a guarantee for the future. At the
> very least we would have to pin that implementation detail.
>
> > > > Or, more likely in the
> > > > first version, you just can't do it at all... Doesn't seem that bad
> > > > to me.
> > >
> > > Any initial opinions on whether it's worse or better than a worker thread?
> >
> > My vote is that it's perfectly fine to make a new feature that only
> > works on some OSes. If/when someone wants to work on getting it going
> > on Windows/AIX/Solaris (that's the complete set of no-epoll, no-kqueue
> > OSes we target), they can write the patch.
>
> Okay. I'm curious to hear others' thoughts on that, too, if anyone's lurking.
>
> Thanks!
> --Jacob

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-07-07 22:13:21 Re: check_strxfrm_bug()
Previous Message Tristan Partin 2023-07-07 20:52:36 Re: check_strxfrm_bug()