Re: Support for NSS as a libpq TLS backend

From: Jacob Champion <pchampion(at)vmware(dot)com>
To: "sfrost(at)snowman(dot)net" <sfrost(at)snowman(dot)net>
Cc: "daniel(at)yesql(dot)se" <daniel(at)yesql(dot)se>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "andrew(dot)dunstan(at)2ndquadrant(dot)com" <andrew(dot)dunstan(at)2ndquadrant(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Subject: Re: Support for NSS as a libpq TLS backend
Date: 2021-03-31 22:15:15
Message-ID: c8d4bc0dfd266799ab4213f1673a813786ac0c70.camel@vmware.com
Lists: pgsql-hackers

On Fri, 2021-03-26 at 18:05 -0400, Stephen Frost wrote:
> * Jacob Champion (pchampion(at)vmware(dot)com) wrote:
> > Yeah. I was hoping to avoid implementing our own locks and refcounts,
> > but it seems like it's going to be required.
>
> Yeah, afraid so.

I think it gets worse, after having debugged some confusing crashes.
PR_Init() has already come up a bit upthread:

> Once we settle on a version we can confirm if PR_Init is/isn't needed and
> remove all traces of it if not.

What the NSPR documentation omits is that implicit initialization is
not threadsafe. So NSS_InitContext() is technically "threadsafe"
because it's built on PR_CallOnce(), but if you haven't called
PR_Init() yet, multiple simultaneous PR_CallOnce() calls can crash into
each other.

So, fine: we just add our own locks around NSS_InitContext() (or around
a single call to PR_Init()). The catch is that the first thread to win
and successfully initialize NSPR gets marked as the "primordial" thread
using thread-local state, and it gets a pthread destructor that does...
something. So lazy initialization seems a bit dangerous regardless of
whether we add locks, though I can't really prove whether it's
dangerous in practice.
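
To make the locking idea concrete, here is a minimal sketch of
serializing a one-time initialization with pthread_once(). It is not
taken from the patch: fake_nspr_init() is a hypothetical stand-in for
PR_Init() or NSS_InitContext() so that the example builds without NSPR,
and ensure_nspr_initialized() and run_racing_threads() are names made up
for illustration. Note that this only addresses the initialization
race; whichever thread wins would still become the primordial thread.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical stand-in for PR_Init() or NSS_InitContext(). It only
 * counts how often it runs, so the sketch is testable without NSPR.
 */
static int	once_init_calls = 0;

static void
fake_nspr_init(void)
{
	once_init_calls++;
}

static pthread_once_t nspr_once = PTHREAD_ONCE_INIT;

/* Every connection attempt would go through here before touching NSS. */
void
ensure_nspr_initialized(void)
{
	pthread_once(&nspr_once, fake_nspr_init);
}

static void *
racing_worker(void *arg)
{
	(void) arg;
	ensure_nspr_initialized();
	return NULL;
}

/*
 * Start up to 64 threads that all race to initialize, wait for them,
 * and return how many times the initializer actually ran.
 */
int
run_racing_threads(int nthreads)
{
	pthread_t	tids[64];
	int			started = 0;
	int			i;

	assert(nthreads > 0 && nthreads <= 64);
	for (i = 0; i < nthreads; i++)
	{
		if (pthread_create(&tids[started], NULL, racing_worker, NULL) == 0)
			started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return once_init_calls;
}
```

However many threads race through ensure_nspr_initialized(), the
initializer runs once. The sketch leaves open which thread that is,
which is exactly the part we can't control for libpq clients.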

I do know that only the primordial thread is allowed to call
PR_Cleanup(), and of course we wouldn't be able to control which thread
does what for libpq clients. I don't know what other assumptions are
made about the primordial thread, or if there are any platform-specific
behaviors with older versions of NSPR that we'd need to worry about. It
used to be that the primordial thread was not allowed to exit before
any other threads, but that restriction was lifted at some point [1].

I think we're going to need some analogue to PQinitOpenSSL() to help
client applications cut through the mess, but I'm not sure what it
should look like, or how we would maintain any sort of API
compatibility between the two flavors. And does libpq already have some
notion of a "main thread" that I'm missing?
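
For discussion only, one possible shape is an opt-out switch in the
spirit of PQinitOpenSSL(), where the application tells the library that
it has already initialized the underlying stack itself. Everything named
below is hypothetical and is not a proposed libpq API:
example_init_control(), example_connection_start(), and
fake_backend_init() (a stand-in for PR_Init()/NSS_InitContext()) exist
only in this sketch.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static bool library_does_init = true;	/* default: the library initializes */
static bool backend_initialized = false;
static int	backend_init_calls = 0;

/* Hypothetical stand-in for PR_Init()/NSS_InitContext(). */
static void
fake_backend_init(void)
{
	backend_init_calls++;
}

/*
 * Hypothetical opt-out switch. An application that initializes NSPR/NSS
 * itself (from a thread of its choosing) would pass 0 before opening
 * its first connection.
 */
void
example_init_control(int do_init)
{
	pthread_mutex_lock(&init_lock);
	library_does_init = (do_init != 0);
	pthread_mutex_unlock(&init_lock);
}

/*
 * Called at the start of each connection attempt. Initializes lazily,
 * under the lock, unless the application opted out. Returns how many
 * times the library has run the initializer so far.
 */
int
example_connection_start(void)
{
	int			calls;

	pthread_mutex_lock(&init_lock);
	if (library_does_init && !backend_initialized)
	{
		fake_backend_init();
		backend_initialized = true;
	}
	calls = backend_init_calls;
	pthread_mutex_unlock(&init_lock);
	return calls;
}
```

With this shape, an application that cares about which thread becomes
primordial could call PR_Init() itself and pass 0, while everyone else
keeps the lazy default. It does not answer the API compatibility
question between the two TLS flavors.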

--Jacob

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=294955
