Re: Proposal for implementing OCSP Stapling in PostgreSQL

From: David Zhang <idrawone(at)gmail(dot)com>
To: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: Proposal for implementing OCSP Stapling in PostgreSQL
Date: 2024-07-17 22:41:55
Message-ID: e624b30e-0f97-4638-96ad-5a2ee7ce6553@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> = Design =
>
> It looks like this design relies on the DBA to manually prefetch OCSP
> responses for their cert chain, and cache them in the local
> ssl_ocsp_file. This is similar to Nginx's ssl_stapling_file directive
> [1]. I think this may make sense for a v1 (much less code!), but it's
> going to take a lot of good documentation on what, exactly, an admin
> has to do to make sure their response file is fresh. IIUC it will
> depend on the CA's policy and how they operate their responder.
>
> We should probably be prepared for complaints that we don't run the
> client ourselves. A v2 could maybe include an extension for running
> the OCSP client in a background worker? Or maybe that's overkill, and
> an existing job scheduler extension could do it, but having a custom
> worker that could periodically check the response file and provide
> warnings when it gets close to expiring might be helpful for admins.
Totally agree. Either Implementing OCSP requests over HTTP, then parsing
the response and then saving the results to a file, or using an OpenSSL
client with a cron job to periodically update the file should work.
Using a cron job would likely have less impact on PostgreSQL.
> I am worried about the multi-stapling that is being done in the tests,
> for example the server-ca-ocsp-good response file. I think those tests
> are cheating a bit. Yes, for this particular case, we can issue a
> single signed response for both the intermediate and the leaf -- but
> that's only because we issued both of them. What if your intermediate
> and your root use different responders [2]? What if your issuer's
> responder doesn't even support multiple certs per request [3]?
>
> (Note that OpenSSL still doesn't support multi-stapling [4], even in
> TLS 1.3 where we were supposed to get it "for free".)

Totally agree, then we should limit OCSP stapling check for the
leaf/PostgreSQL server certificate only in v1.

> I think it would be good for the sslocspstapling directive to 1) maybe
> have a shorter name (cue bikeshed!) and 2) support more than a binary
> on/off. For example, the current implementation could use
> "disable/require" options, and then a v2 could add "prefer" which
> simply sends the status request and honors must-staple extensions on
> the certificate. (That should be safe to default to, I think, and it
> would let DBAs control the stapling rollout more easily.)

This will definitely give end users more options, especially during the
transition period.

> A question from ignorance: how does the client decide that the
> signature on the OCSP response is actually valid for the specific
> chain in use?

If I understand correctly, the certificate used by the OCSP responder to
sign the OCSP response must be valid for the specific chain in use, or
the admins allow to load a new chain to validate the certificate used to
sign the OCSP response. I think it would be better to make this part to
be more open.

>
> = Tests =
>
> I think the tests should record the expected_stderr messages for
> failures (see the Code section below for why).
+1
> If it turns out that multi-stapling is safe, then IMO the tests should
> explicitly test both TLSv1.2 and v1.3, since those protocols differ in
> how they handle per-certificate status.
>
> If it's okay with you, I'd like to volunteer to refactor out the
> duplicated recipes in sslfiles.mk. I have some particular ideas in
> mind and don't want to make you play fetch-a-rock. (No pressure to use
> what I come up with, if you don't like it.)
That would be great, thanks a lot in advance!
> Because a CRL is a valid fallback for OCSP in practice, I think there
> should be some tests for their interaction. (For v1, maybe that's as
> simple as "you're not allowed to use both yet", but it should be
> explicit.)
+1
> = Code =
>
> (this is a very shallow review)
>
> +#define OCSP_CERT_STATUS_OK 1
> +#define OCSP_CERT_STATUS_NOK (-1)
>
> Returning f -1 from the callback indicates an internal error, so we're
> currently sending the wrong alerts for OCSP failures ("internal error"
> rather than "bad certificate status response") and getting unhelpful
> error messages from libpq. For cases where the handshake proceeds
> correctly but we don't accept the OCSP response status, I think we
> should be returning zero.
+1
> Also, we should not stomp on the OCSP_ namespace (I thought these
> macros were part of the official <openssl/ocsp.h> API at first).
+1
> + /* set up OCSP stapling callback */
> + SSL_CTX_set_tlsext_status_cb(SSL_context, ocsp_stapling_cb);
>
> It seems like this should only be done if ssl_ocsp_file is set?

+1

Thanks a lot for reviewing and providing all your feedback!

Best regards,

David Zhang

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-07-17 23:24:29 Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Previous Message David Rowley 2024-07-17 22:33:35 Add mention of execution time memory for enable_partitionwise_* GUCs