From: | Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PoC] Federated Authn/z with OAUTHBEARER |
Date: | 2025-03-06 20:57:24 |
Message-ID: | CAOYmi+n4EDOOUL27_OqYT2-F2rS6S+3mK-ppWb2Ec92UEoUbYA@mail.gmail.com |
Lists: | pgsql-hackers

On Tue, Mar 4, 2025 at 2:44 PM Jacob Champion
<jacob(dot)champion(at)enterprisedb(dot)com> wrote:
> Maybe. My first attempt gets all the BSDs green except macOS -- which
> now fails in a completely different test, haha... -_-

Small update: there is not one bug, but three that interact. ಠ_ಠ
1) The test server advertises an issuer of `https://localhost:<port>`,
but it doesn't listen on all localhost interfaces. When Curl tries to
contact the issuer on IPv6, its Happy Eyeballs handling usually falls
back to IPv4 after discovering that IPv6 is nonfunctional, but
occasionally it contacts something that was temporarily listening
there instead.

Since I don't really want to write a bunch of IPv6 fallback code for
the test server -- this should be testing OAuth, not finding all the
ways that buildfarm OSes can expose dual stack sockets -- I changed
the issuer to be IPv4-only. When I did this, the interval timing tests
immediately failed on macOS.
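
To illustrate the IPv4-only change in Problem 1, here is a minimal Python sketch. It assumes a plain TCP listener; the helper names are hypothetical, TLS setup is omitted, and it is not the actual test server code. The point is that the advertised issuer names the exact address the socket is bound to, so a client has no IPv6 candidate to try first.

```python
import socket


def bind_ipv4_loopback():
    # Bind explicitly to the IPv4 loopback address; port 0 lets the OS pick a free port.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen()
    return sock


def issuer_for(sock):
    # Advertise the literal bound address rather than the name "localhost",
    # which may resolve to both ::1 and 127.0.0.1.
    host, port = sock.getsockname()
    return f"https://{host}:{port}"


def localhost_families():
    # Address families that "localhost" resolves to on this machine
    # (the result is system-dependent).
    infos = socket.getaddrinfo("localhost", None, type=socket.SOCK_STREAM)
    return {info[0] for info in infos}
```

On a dual stack machine, localhost_families() would typically contain both AF_INET and AF_INET6, which is what gives a Happy Eyeballs client an IPv6 address to attempt.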
2) macOS's EVFILT_TIMER implementation seems to be different from the
other BSDs. On Mac, when you re-add a timer to a kqueue, any existing
timer-fired events for it are not cleared out and the kqueue might
remain readable. This breaks a postcondition of our set_timer()
function, which is that new timeouts are supposed to completely
replace previous timeouts.

With a dual stack issuer, the Happy Eyeballs timeouts would be
routinely cleared out by libcurl, setting up a clean slate for the
next call to set_timer(). But with an IPv4-only issuer, libcurl didn't
need to clear out the timeouts (they'd already fired), which meant
that our call to set the ping interval was ineffective.
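
The set_timer() postcondition from Problem 2 can be modeled without kqueue. The Python sketch below is only a toy: it assumes a pipe can stand in for the kernel timer's "fired" state, the class and method names are hypothetical, and it is neither the libpq code nor the eventual fix. It shows why re-arming without discarding an already-fired event leaves the descriptor readable.

```python
import os
import select


class PipeTimer:
    """Toy model: a pipe stands in for a kernel timer's level-triggered 'fired' state."""

    def __init__(self):
        self._r, self._w = os.pipe()
        os.set_blocking(self._r, False)
        self.timeout_ms = None  # None means no timer is armed

    def fire(self):
        # Simulate the timer expiring: queue a fired event that nobody has consumed.
        os.write(self._w, b"x")

    def readable(self):
        # Zero-timeout poll: True if a fired event is still pending.
        ready, _, _ = select.select([self._r], [], [], 0)
        return bool(ready)

    def set_timer(self, timeout_ms, drain=True):
        # Postcondition (when drain=True): the new timeout fully replaces the old
        # one, so no stale fired event may remain.
        if drain:
            try:
                while os.read(self._r, 64):
                    pass
            except BlockingIOError:
                pass  # pipe is empty
        self.timeout_ms = timeout_ms

    def close(self):
        os.close(self._r)
        os.close(self._w)
```

Calling set_timer(..., drain=False) after fire() mimics the behavior described for macOS: the timer is re-armed, but the old event still makes the descriptor readable.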
3) There is a related performance bug on other platforms. If a Curl
timeout happens partway through a request (so libcurl won't clear it),
the timer-expired event will stay set and CPU will be burned to spin
pointlessly on drive_request(). This is much easier to notice after
taking Happy Eyeballs out of the picture. It doesn't cause logical
failures -- Curl basically discards the unnecessary calls -- but it's
definitely unintended.

--

Problem 1 is a simple patch. I am working on a fix for Problem 2, but
I got stuck trying to get a "perfect" solution working yesterday...
Since this is a partial(?) blocker for getting NetBSD going, I'm going
to pivot to an ugly-but-simple approach today.

I plan to defer working on Problem 3, which should just be a
performance bug, until the tests are green again. And I would like to
eventually add some stronger unit tests for the timer behavior, to
catch other potential OS-specific problems in the future.

Thanks,
--Jacob