Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-08-09 00:24:00
Message-ID: 20190809002400.rguhd76ghfwxpru7@momjian.us
Lists: pgsql-hackers

On Thu, Aug 8, 2019 at 03:07:59PM -0400, Stephen Frost wrote:
> Greetings,
>
> * Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> > On Tue, Jul 9, 2019 at 11:09:01AM -0400, Bruce Momjian wrote:
> > > On Tue, Jul 9, 2019 at 10:59:12AM -0400, Stephen Frost wrote:
> > > > * Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> > > > I agree that all of that isn't necessary for an initial implementation,
> > > > I was rather trying to lay out how we could improve on this in the
> > > > future and why having the keying done at a tablespace level makes sense
> > > > initially because we can then potentially move forward with further
> > > > segregation to improve the situation. I do believe it's also useful in
> > > > its own right, to be clear, just not as nice since a compromised backend
> > > > could still get access to data in shared buffers that it really
> > > > shouldn't be able to, even broadly, see.
> > >
> > > I think TDE is a feature of questionable value at best and the idea that
> > > we would fundamentally change the internals of Postgres to add more
> > > features to it seems very unlikely. I realize we have to discuss it so
> > > we don't block reasonable future feature development.
> >
> > I have a new crazy idea. I know we concluded that allowing multiple
> > independent keys, e.g., per user, per table, didn't make sense since
> > they have to be unlocked all the time, e.g., for crash recovery and
> > vacuum freeze.
>
> I'm a bit confused as I never agreed that made any sense and I continue
> to feel that it doesn't make sense to have one key for everything.

I clearly explained why multiple keys, while desirable, have many
negatives. If you want to address my replies, we can go over them
again. What people want, and what we can reasonably accomplish, are two
different things.

> Crash recovery doesn't happen "all the time" and neither does vacuum
> freeze, and autovacuum processes are independent of individual client
> backends- we don't need to (and shouldn't) have the keys in shared
> memory.

Uh, I just don't know what that would look like, honestly. I am trying
to get us toward something that is easily implemented and easy to
control.

> > However, that assumes that all heap/index pages are encrypted, and all
> > of WAL. What if we encrypted only the user-data part of the page, i.e.,
> > tuple data. We left xmin/xmax unencrypted, and only stored the
> > encrypted part of that data in WAL, and didn't encrypt any more of WAL.
>
> This is pretty much what Alvaro was suggesting a while ago, isn't it..?
> Have just the user data be encrypted in the table and in the WAL stream.

Well, I think he was saying that to reduce the overhead of encryption.
I didn't see it as a way of allowing recovery and vacuum freeze. My
exact reply was:

> Well, you would need to decide what WAL information needs to be secured.
> Is the fact an insert was performed on a table a security issue?
> Depends on your risks. My point is that almost anything you do beyond
> cluster-level encryption either adds complexity that is bug-prone or
> fragile, or adds unacceptable overhead, or leaks security information.

> > That might allow crash recovery and the freeze part of VACUUM FREEZE to
> > work. (I don't think we could vacuum since we couldn't read the index
> > pages to find the matching rows since the index values would be encrypted
> > too. We might be able to not encrypt the tid in the index tuple.)
>
> Why do we need the indexed values to vacuum the index..? We don't
> today, as I recall. We would need the tids though, yes.

Uh, well, if we are doing index cleaning by doing a sequential scan of
the index, which I think we have done for many years, I think just
looking at the tids should work. However, I don't know if we ever
adjust index entries, like re-balance the trees.
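
To make the tid-only cleanup concrete, here is a rough sketch in Python.
It is not Postgres code, and IndexEntry and bulk_delete are made-up names
for illustration. It assumes the index key is stored as an opaque
encrypted blob while the tid is left unencrypted, and it only covers
removing entries, not any restructuring of the tree:

```python
from typing import List, NamedTuple, Set, Tuple

Tid = Tuple[int, int]  # (block number, offset); assumed stored unencrypted


class IndexEntry(NamedTuple):
    encrypted_key: bytes  # opaque blob; this sketch never decrypts it
    tid: Tid


def bulk_delete(entries: List[IndexEntry], dead_tids: Set[Tid]) -> List[IndexEntry]:
    """Scan all entries in physical order and keep only those whose tid is live.

    The decision uses the tid alone, so no key material is needed.
    """
    return [entry for entry in entries if entry.tid not in dead_tids]


# Hypothetical index contents; the byte strings stand in for ciphertext.
index = [
    IndexEntry(b"\x8f\x01", (0, 1)),
    IndexEntry(b"\x13\xa2", (0, 2)),
    IndexEntry(b"\xc4\x7e", (1, 1)),
]

# Tids the heap scan reported as dead.
cleaned = bulk_delete(index, {(0, 2)})
```

The scan never compares or orders the encrypted keys, which is the
property that matters here. Anything that needs key comparisons is
outside what this sketch shows.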

> > Is this something to consider in version one of this feature? Probably
> > not, but later? Never? Would the information leakage be too great,
> > particularly from indexes?
>
> What would be leaking from the indexes..? That an encrypted blob in the
> index pointed to a given tid? Wouldn't someone be able to see that same
> information by looking directly at the relation too?

Well, I assume we would encrypt the heap and its indexes. For example,
if there is an employee table, and there is an index on the employee
last name and employee salary, it would be trivial to get a list of
employee salaries sorted by last name by just joining the tids, though
you would not know the last names. That seems like an information leak
to me. Plus, which tables were updated would be visible in WAL. And we
would have issues with system tables, pg_statistic, and lots of other
complexity.
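
As a rough illustration of the tid join, here is a Python sketch. The
table contents, the build_index and correlate helpers, and the use of a
hash as a stand-in for ciphertext are all assumptions made for the
example. It also assumes an attacker can read index entries in key
order with the tids unencrypted. Under those assumptions the attacker
learns the relative salary order lined up against last-name order,
without decrypting anything:

```python
import hashlib

# Hypothetical heap: tid -> (last_name, salary). The attacker never reads this.
heap = {
    (0, 1): ("Smith", 50000),
    (0, 2): ("Adams", 90000),
    (0, 3): ("Jones", 70000),
}


def build_index(column):
    """Return index entries in key order as (opaque_blob, tid).

    The SHA-256 digest is only a stand-in for an encrypted key value.
    """
    ordered = sorted(heap.items(), key=lambda item: item[1][column])
    return [
        (hashlib.sha256(repr(row[column]).encode()).digest(), tid)
        for tid, row in ordered
    ]


def correlate(name_index, salary_index):
    """Attacker's view: join the two indexes on tid, ignoring the blobs."""
    salary_rank = {tid: rank for rank, (_blob, tid) in enumerate(salary_index)}
    return [(tid, salary_rank[tid]) for _blob, tid in name_index]


name_index = build_index(0)    # ordered by last name
salary_index = build_index(1)  # ordered by salary

# Rows in last-name order, each paired with its salary rank (0 = lowest).
leak = correlate(name_index, salary_index)
```

Here the row that sorts first by last name turns out to have the highest
salary rank, and nothing was decrypted to learn that.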

I can see value in eventually doing this, perhaps before we perform
cluster-wide encryption, but doing it without cluster-wide encryption
seems like it would leak too much information to be useful.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
