Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Federated Authn/z with OAUTHBEARER
Date: 2025-02-13 22:56:29
Message-ID: 83C44AB4-24B0-437F-B139-B5CBC5821BB1@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 8 Feb 2025, at 02:56, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> wrote:

Thanks for the new version!

> - 0004 gets a missed pgperltidy and explicitly skips unsupported tests
> on Windows.

+if ($Config{osname} eq 'MSWin32')
We can get away with nog importing Config at all since Test::Utils export a
symbol for this already in $windows_os. Fixed in my attached.

> Daniel and I talked at FOSDEM about wanting to have additional
> guardrails on the server-side validator API. Ideally, we'd wait for
> major version boundaries to change APIs, as per usual. But if any bugs
> come to light that affect the security of the system, we may want to
> have more control over the boundary between the server and the
> validator. So I've added two features to the API.

I think what you added there is quite sufficient for handling the worst case
that ideally should never happen. Even though I can't see us breaking this
given the code being trivial, I also don't feel like realizing after the fact
when we need it that it was subtly broken, so I added a test validator which
use the wrong ABI version.

I have now read the entire patch cover-to-cover twice to try and catch any
rough or sharp edges. Unsurprisingly given the number of revisions this patch
has gone through, and the number of hours that have been put into it, there
isn't much to be found. Most of my findings below are well and truly in the
nit- pickery category (and my favourite, paranoia-induced defensive
programming). There are no architectural flaws that I can detect, and cross-
referencing with the RFC's I don't see anything mixed up in spec compliance.

To make it easier for you to see what I mean I have implemented most of the
comments and attached as a fixup patch, from which you can cherry-pick hunks
you agree with. Those I didn't implement should be marked as such below.

As we discussed off-list I took the liberty of squashing the previous fixup
patches into a single one, and squashed your fixes for my comments against v47
into 0001. All of my proposals are in 0004.

Some comments:

+ The system which hosts the protected resources which are
The repetition of "which .. which" reads a bit off to me, I propose to
simplify as "The system hosting then.." instead.

+ The organization, product vendor, or other entity which develops and/or
+ administers the OAuth servers and clients for a given application.
Since we define terminology here, shouldn't this be "OAuth resource servers"?

+ <productname>PostgreSQL</productname> does not provide an authorization
+ server; it's obtained from the OAuth provider.
The "obtained from" part makes it sound like you need to get some server
software to run this with PostgreSQL. How about "; it is the responsibility
of.."?

+ <xref linkend="libpq-connect-oauth-issuer"/> setting. No variations in
+ case or format are permitted.
While not wrong in any way, I think it would be clearer to write "formatting"
here since that's really what we are talking about no?

+ Build with libcurl support for OAuth 2.0 client flows.
+ This requires the <productname>curl</productname> package to be
+ installed. Building with this will check for the required header files
I don't think we need to document that curl need to be installed since it's
likely already on the system of anyone reading this. I do however think we
should state the minimum required version.

+ setting in <link linkend="auth-oauth">the server's HBA configuration.</link>
Nitpickery: I prefer to have the period outside the <link> markup.

+ the connection will fail. This is required to prevent a class of "mix-up
+ attacks" on OAuth clients.
Since mix-up attacks aren't very well documented I think we should aid the
readers by linking to the OAuth WG announcement of this class of attacks. From
there readers can find the original paper but I think linking directly to that
is less helpful than the mailinglist post.

+ running a client via SSH. Client applications may implement their own flows
For consistency with the rest of the docs we should wrap SSH with an
<application> tag.

+ You will then log into your OAuth provider, which will ask whether you want
A think third person here reads more like the rest of the docs.

+ which <application>libpq</application> will call when when action is
"when when", I think this should really be "when an"?

+ sprays HTTP traffic (containing several critical secrets) to standard
+ error during the OAuth flow
This might not be readily understandable by non-native speakers.

+ Similarly, if you are using Curl inside your application,
Should use <productname> markup.

+ more recent versions of Curl that are built to support threadsafe
s/threadsafe/thread-safe/g for documentation consistency (ditto in other places
where used as a adjective and not code identifier).

+ itself; validator modules provide the glue between the server and the OAuth
s/glue/integration layer/ to avoid confusing readers not used to english idioms.

+ Since a misbehaving validator might let unauthorized users into the database,
+ correct implementation is critical. See
"Don't make any bugs" isn't very helpful advice =) Expanded on it slightly.

+ An OAuth validator module is loaded by dynamically loading one of the shared
The double use of load in "loaded .. loading", rewording to try and simplify.

+ The server has ensured that the token is well-formed syntactically, but no
"server" is an overloaded nomenclature here, perhaps using libpq instead to
clearly indicate that it's postgres and not an OAuth server.

+ The connection will only proceed if the module sets
+ <structfield>authorized</structfield> to <literal>true</literal>. To
In other places we use the structure name as well and not just the member,
adding that here to be consistent.

+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
Off-by-one

+ * - RFC 7628: https://tools.ietf.org/html/rfc7628
Replacing all tools.ietf.org mentions with datatracker.ietf.org to save a few
redirects.

+ p++;
+ if (*p != ',')
If the SASL exchange, are we certain that a rogue client cannot inject a
message which trips us past the end of string? Should we be doublecheck when
advancing p across the message?

+sanitize_char(char c)
+{
+ static char buf[5];
With the multithreading work on the horizon we should probably avoid static
variables like these to not create work for our future selves? The code isn't
as neat when passing in a buffer/length but it avoids the need for a static or
threadlocal variable. Or am I overthinking this?

+ initStringInfo(&issuer);
+ appendStringInfoString(&issuer, ctx->issuer);
The double StringInfoData variables in generate_error_response to be able to
JSON escape the issuer is a bit of an eye-sore, a version of
escape_json_with_len which also took an offset into the buffer on where to
start maybe? Nothing that is urgent to address now (and I have not changed
anything here), but I'll keep it at the back of my head.

+ ereport(LOG, errmsg("internal error in OAuth validator module"));
I wonder if this should be using WARNING instead? It's really something that
should trigger alarm bells going off. I've also added an errcode for easier
fleet analysis.

In load_validator_library we don't explicitly verify that the required callback
is defined in the returned structure, which seems like a cheap enough belts and
suspenders level check.

+ if (parsed < 1)
+ return actx->debugging ? 0 : 1;
Is 1 second a sane lower bound on interval for all situations? I'm starting to
wonder if we should be more conservative here, or even make it configurable in
some way? The default if not set of 5 seconds is quite a lot higher than 1.

+ if (INT_MAX <= parsed)
I think it's closer to project to style to keep the variable on the left side
in such comparisons, so changed these.

+ parsed = parse_json_number(expires_in_str);
+ parsed = round(parsed);
Shouldn't we floor() the value here to ensure we never report an expiration
time longer than the actual expiration?

+ * Some services (Google, Azure) spell verification_uri differently.
I did another round of documentation reading and couldn't find any provider
which also use "verification_url_complete". However, since it looks so similar
to verification_url it seems worthwhile to add a comment to save readers from
the same rabbithole.

register_socket() doesn't have an error catch for the case when neither epoll
nor kqeue is supported. Shouldn't it set actx_error() here as well? (Not done
in my review patch.)

+ if (actx->curl_err[0])
+ {
+ size_t len;
+
+ appendPQExpBuffer(&conn->errorMessage, " (%s)", actx->curl_err);
Should this also qualify that the error comes from outside of postgres?
Something like "(libcurl:%s)" to match? I haven't changed this in the attached
since I'm still on the fence, but I'm leanings that we probably should.
Thoughts?

- * We only support one mechanism at the moment, so rather than deal with a
+ * We only support two mechanisms at the moment, so rather than deal with a
While there's nothing incorrect about this comment, I have a feeling we won't
support more mechanisms than we can justify having a simple array for anytime
soon =)

Sorry for the wall of text.

In general, I feel that this is getting very close to its final form wrt being
a committable patch, and assuming we don't find anything structurally unsound
in the coming days I don't see a blocker for getting this into v18 before the
final commitfest. If anyone disagrees with this I'd love for that be brought
up.

--
Daniel Gustafsson

Attachment Content-Type Size
v49-0001-Add-OAUTHBEARER-SASL-mechanism.patch application/octet-stream 313.1 KB
v49-0002-v48-fixup-patches-Add-OAUTHBEARER-SASL-mechanism.patch application/octet-stream 17.6 KB
v49-0003-XXX-fix-libcurl-link-error.patch application/octet-stream 1.2 KB
v49-0004-v49-Fixups-proposed-by-Daniel.patch application/octet-stream 37.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2025-02-13 23:01:07 Re: [PoC] Federated Authn/z with OAUTHBEARER
Previous Message Thomas Munro 2025-02-13 22:53:47 Re: BitmapHeapScan streaming read user and prelim refactoring