From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) |
Date: | 2019-06-12 18:48:43 |
Message-ID: | 20190612184843.etw4j733xhv7bzff@momjian.us |
Lists: | pgsql-hackers |
On Wed, Jun 5, 2019 at 11:54:04AM +0900, Masahiko Sawada wrote:
> On Fri, May 10, 2019 at 2:42 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > I think we need to step back and see what we want to do. There are six
> > levels of possible encryption:
> >
> > 1. client-side column encryption
> > 2. server-side column encryption
> > 3. table-level
> > 4. database-level
> > 5. tablespace-level
> > 6. cluster-level
> >
> > 1 & 2 encrypt the data in the WAL automatically, and option 6 is
> > encrypting the entire WAL. This leaves 3-5 as cases where there will be
> > a mismatch between the object-level encryption and WAL. I don't think it
> > is very valuable to use these options so reencryption will be easier.
> > In many cases, taking any object offline might cause the application to
> > fail, and having multiple encrypted data keys active will allow key
> > replacement to be done on an as-needed basis.
> >
>
> Summarizing the design discussion so far and the discussion I had at
> PGCon, there are several basic design items here. Each of them is
> loosely related and there are trade-offs.
>
> 1. Encryption Levels.
> As Bruce suggested there are 6 levels. Fine-grained control will
> help to avoid performance overhead on tables that we don't
> actually need to encrypt. Even in terms of security it might help,
> since we don't give the key to users who don't or cannot access
> encrypted tables. But whichever level we choose, we can protect
> data from attacks that bypass PostgreSQL's ACL, such as reading database
> files directly, as long as we encrypt data inside the database. The threats
> we want to protect against have already reached consensus, I think.
I think level 6 is an obvious must-have. What is less clear is
whether we gain enough by implementing levels 3-5 to justify the
added complexity in the code and user interface.

The big question is how many people will be mixing encrypted and
unencrypted data in the same cluster, and care about performance. Just
because someone might care is not enough of a justification. They can
certainly create separate encrypted and non-encrypted clusters. Can we
implement level 6 and then implement levels 3-5 later if desired?
> Among these levels, the tablespace level would be somewhat different
> from the others because it corresponds to physical directories rather than
> database objects. So in principle it's possible that tables are
> created on an encrypted tablespace while indexes are created on a
> non-encrypted tablespace, which does not make sense though. But having
> fewer encryption keys would be better for a simple architecture.
How would you configure the WAL to know which key to use if we did #5?
Wouldn't system tables and statistics, and perhaps referential integrity,
allow for information leakage?
> 2. Encryption Objects.
> Indexes, WAL and TOAST tables pertaining to encrypted tables, and
> temporary files must also be encrypted, but we need to discuss whether
> we encrypt non-user data as well, such as SLRU data, vm and fsm, and
> perhaps even other files such as 2PC state files, backend_label etc.
> Encrypting everything is required by some use cases, but it's also true
> that there are users who wish to encrypt the database while minimizing
> performance overhead.
I don't think we need to encrypt the "status" files like SLRU data, vm
and fsm.
> 3. Encryption keys.
> Encryption levels are related to the number of encryption keys
> we use. The database cluster level would use a single encryption key
> and could more easily encrypt everything, including non-user data such as
> xact WAL and SLRU data, with the same key. On the other hand, for instance,
> the table level would use multiple keys and could encrypt tables with
> different encryption keys. One advantage of having multiple keys in a
> database would be that we can re-encrypt encrypted database objects
> on an as-needed basis. For instance, in a multi-tenant architecture,
> stopping the database cluster would affect all services, but we can
> re-encrypt data one by one while minimizing the downtime of each service
> if we use multiple keys. Even in terms of security, having multiple
> keys helps to diversify risk.
I agree we need a 2 tier key hierarchy. See my pgcryptokey extension
as an example:
http://momjian.us/download/pgcryptokey/
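
For readers unfamiliar with the term, here is a minimal sketch of a two-tier
key hierarchy. It is not how pgcryptokey or any PostgreSQL patch implements
it. All function names are hypothetical, and the HMAC-based "wrap" is a toy
stand-in for a real key-wrap cipher, chosen only so the example runs with the
Python standard library. The point it illustrates is that a passphrase-derived
key-encryption key (KEK) protects a random data-encryption key (DEK), so
changing the passphrase only re-wraps the DEK and leaves the data untouched.

```python
import hashlib
import hmac
import secrets


def derive_kek(passphrase: str, salt: bytes) -> bytes:
    """Tier 1: derive the key-encryption key (KEK) from a passphrase."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 100_000)


def _pad(kek: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream built from HMAC-SHA256; NOT a production key-wrap cipher.
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(kek, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def wrap_dek(kek: bytes, dek: bytes) -> bytes:
    """Tier 2: store the data-encryption key (DEK) only in wrapped form."""
    nonce = secrets.token_bytes(16)
    body = bytes(a ^ b for a, b in zip(dek, _pad(kek, nonce, len(dek))))
    tag = hmac.new(kek, b"tag" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unwrap_dek(kek: bytes, wrapped: bytes) -> bytes:
    """Recover the DEK, rejecting a wrong KEK or a corrupted blob."""
    nonce, body, tag = wrapped[:16], wrapped[16:-32], wrapped[-32:]
    expected = hmac.new(kek, b"tag" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted wrapped key")
    return bytes(a ^ b for a, b in zip(body, _pad(kek, nonce, len(body))))


def rotate_kek(old_kek: bytes, new_kek: bytes, wrapped: bytes) -> bytes:
    """Change the passphrase-level key without touching encrypted data."""
    return wrap_dek(new_kek, unwrap_dek(old_kek, wrapped))
```

Under these assumptions, rotating the KEK costs one unwrap and one wrap of a
32-byte key, regardless of how much data the DEK protects.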
> Apart from the above discussion, there are various concerns about the
> fine-grained design. For WAL encryption, as a
> result of the discussion so far I'm going to use the same encryption for
> WAL encryption as that used for tables. Given that approach, utility
> commands that read WAL (pg_waldump and pg_rewind) would need to be
> able to get arbitrary encryption keys. pg_waldump might
> even require the encryption key for WAL of a table that has already been
> dropped. As I discussed at PGCon[3], rearranging the WAL format would
> solve this issue, but it doesn't resolve the fundamental issue.
Good point about pg_waldump. I am a little worried we might open a
security hole by making a new API so those tools work, so maybe we
should avoid it.
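
To make the pg_waldump concern concrete, here is a toy model of it. It is not
PostgreSQL's WAL format: `WalRecord`, `key_id`, and `dump_wal` are hypothetical
names, and the keyring is just a dictionary. It only shows that with per-table
keys a WAL reader loses access to old records once a table's key goes away,
while a single cluster-wide key has no such gap.

```python
from dataclasses import dataclass


@dataclass
class WalRecord:
    lsn: int
    key_id: str  # hypothetical: which key encrypted this record


def dump_wal(wal, keyring):
    """Split record LSNs into those a reader can decode and those it cannot."""
    readable, unreadable = [], []
    for rec in wal:
        if rec.key_id in keyring:
            readable.append(rec.lsn)
        else:
            unreadable.append(rec.lsn)
    return readable, unreadable


# Per-table keys: dropping tbl_b also discards its key from the keyring.
wal = [WalRecord(1, "tbl_a"), WalRecord(2, "tbl_b"), WalRecord(3, "tbl_a")]
keyring = {"tbl_a": b"key-a", "tbl_b": b"key-b"}
del keyring["tbl_b"]
per_table = dump_wal(wal, keyring)  # ([1, 3], [2])

# Cluster-level key: every record uses the same key, so nothing is lost.
cluster_wal = [WalRecord(lsn, "cluster") for lsn in (1, 2, 3)]
cluster = dump_wal(cluster_wal, {"cluster": b"key-c"})  # ([1, 2, 3], [])
```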
> Also, system catalog encryption could be a hard part. System
> catalogs are initially created at initdb time and are copied
> from template1 on CREATE DATABASE. Therefore we would need to either
> modify initdb so that it's aware of encryption keys and KMS, or modify
> database creation so that it copies database files while encrypting
> them.
I assume initdb will use the same API that you would use to start the
server itself, e.g., type in a password, or contact a key server.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +