Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Sehrope Sarkuni <sehrope(at)jackdb(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-08-23 14:35:17
Message-ID: 20190823143517.GW16436@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> On Fri, Aug 23, 2019 at 07:45:22AM -0400, Stephen Frost wrote:
> > Having listed out the feature set of each of the other major databases
> > when it comes to TDE is exactly how we objectively look at what is being
> > done in the industry, and that then gives us an understanding of what
> > users (and auditors) coming from other platforms will expect.
> >
> > I entirely agree that we shouldn't just copy N feature from X other
> > database system unless we feel that's the best approach, but when every
> > other database system out there has capability Y for the general feature
> > X that we're thinking about implementing, we should be questioning an
> > approach which doesn't include that.
>
> Agreed. The features of other databases are a clear source for what we
> should consider and run through the useful/reasonable filter.

Following on from that- when other databases don't have something that
we're thinking about implementing, maybe we should be contemplating if
it really makes sense as a requirement for us.

Specifically in this case- I went back and tried to figure out what
other database systems have an "encrypt EVERYTHING" option. I didn't
have much luck finding one though. So I think we need to ask ourselves-
the "check box" that we're trying to check off with TDE, do the other
database systems check that box? If so, then it looks like the "check
box" isn't actually "encrypt EVERYTHING", it's more along the lines of
"make sure all regular user data is encrypted automatically" or some
such, and that's a very different requirement, which seems to be
answered by the other systems by having a KMS + tablespace/database
level encryption. We certainly shouldn't be putting a lot of effort
into building something that is either overkill or won't be interesting
to users due to limitations like "have to take the entire cluster
offline to re-key it".

Now, that KMS has to be encrypted using a master key, of course, and we
have to make sure that it is able to survive across a crash, and it'd
sure be nice if it was indexed. One option for such a KMS would be
something entirely external (which could potentially just be another PG
database or something) but it'd be nice if we had something built-in.
We might also want it to be replicated (or maybe we don't, as was
discussed on the call, to allow for a replica to use an independent set
of keys- of course that leads to issues with pg_rewind and such though).

Anything built-in does seem like it'd be a fair bit of work to get it to
address those requirements, but that does seem to be what the other
database systems have done. Unfortunately, their documentation doesn't
seem to really say exactly what they've done to address that.

A couple random ideas that probably won't work, but I'll put them out
there for others to shoot down-

Some kind of 2-phase WAL pass, where we do WAL replay for the
non-encrypted bits first (which would include the KMS) and then go back
and WAL replay the encrypted stuff. Seems terrible.
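
To make the ordering concrete, here is a toy sketch of such a two-pass
replay in Python. It is not PostgreSQL code: WalRecord, xor_cipher and
two_phase_replay are hypothetical names, every unencrypted record is
assumed to be a KMS key record, and the XOR "cipher" is only a stand-in
for real encryption.

```python
from dataclasses import dataclass


@dataclass
class WalRecord:
    lsn: int          # position in the (toy) WAL stream
    encrypted: bool   # False for KMS key records, True for data records
    key_id: str       # key this record defines (KMS) or was encrypted with (data)
    payload: bytes    # key material (KMS) or ciphertext (data)


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Placeholder cipher for illustration only; XOR is not real encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def two_phase_replay(records):
    """Replay unencrypted (KMS) records first, then the encrypted ones."""
    ordered = sorted(records, key=lambda r: r.lsn)

    # Pass 1: rebuild the key store from the unencrypted records.
    keys = {}
    for rec in ordered:
        if not rec.encrypted:
            keys[rec.key_id] = rec.payload

    # Pass 2: go back over the same stream and decrypt the data records.
    applied = []
    for rec in ordered:
        if rec.encrypted:
            applied.append((rec.lsn, xor_cipher(keys[rec.key_id], rec.payload)))

    return keys, applied
```

Note that in this sketch the two passes no longer apply records in
strict LSN order, which is one reason the approach looks unappealing.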

An independent WAL for the KMS only. Ugh, do we need another walwriter
then? And buffers, and lots of other stuff?

Some kind of flat-file based approach with a temp file and renaming of
files using durable_rename(), like what we used to do with
pg_shadow/authid, and now do with replorigin_checkpoint and such?
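
As a rough illustration of the flat-file idea, here is a sketch of the
write-temp-file, fsync, rename, fsync-directory pattern in Python. It
is a userland analogue under POSIX assumptions, not PostgreSQL's
durable_rename(); the function name durable_replace and the ".tmp"
suffix are made up for the example.

```python
import os


def durable_replace(path: str, data: bytes) -> None:
    """Replace the file at path so a crash leaves either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical temp-file naming

    # 1. Write the new contents to a temp file in the same directory
    #    and flush them to stable storage before the rename.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    finally:
        os.close(fd)

    # 2. Atomically swap the temp file into place.
    os.replace(tmp_path, path)

    # 3. fsync the containing directory so the rename itself is durable
    #    (opening a directory like this assumes a POSIX system).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A reader of the file then sees either the complete previous key file
or the complete new one, never a partial write.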

Something else?

Thoughts?

Thanks!

Stephen
