Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Sehrope Sarkuni <sehrope(at)jackdb(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Joe Conway <mail(at)joeconway(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-15 22:08:28
Message-ID: CAH7T-are4DWvunDWknRcUGGLgv8H2FgPmQ8TTajCoztEadK+iA@mail.gmail.com
Lists: pgsql-hackers

Hi,

Some more thoughts on CBC vs CTR modes. There are a number of
advantages to using CTR mode for page encryption.

CTR encryption modes can be fully parallelized, whereas CBC can only be
parallelized for decryption. While both can use AES specific hardware
such as AES-NI, CTR modes can go a step further and use vectorized
instructions.

On an i7-8559U (with AES-NI) I get roughly a 4x speed improvement for
CTR-based modes vs CBC when run on 8K of data:

# openssl speed -evp ${cipher}
type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc  1024361.51k  1521249.60k  1562033.41k  1571663.87k  1574537.90k  1575512.75k
aes-128-ctr   696866.85k  2214441.86k  4364903.85k  5896221.35k  6559735.81k  6619594.75k
aes-128-gcm   642758.92k  1638619.09k  3212068.27k  5085193.22k  6366035.97k  6474006.53k
aes-256-cbc   940906.25k  1114628.44k  1131255.13k  1138385.92k  1140258.13k  1143592.28k
aes-256-ctr   582161.82k  1896409.32k  3216926.12k  4249708.20k  4680299.86k  4706375.00k
aes-256-gcm   553513.89k  1532556.16k  2705510.57k  3931744.94k  4615812.44k  4673093.63k

For relation data where the encryption is going to be per page,
there's flexibility in how the CTR nonce (IV + counter) is generated.
With an 8K page, the counter need only cover 512 values for each page
(8192 bytes per page / 16 bytes per AES block). That would require
9 bits for the counter. Rounding that up to 16 bits allows for wider
pages and it still uses only two bytes of the counter while ensuring
that it'd be unique per AES block. The remaining 14 bytes would be
populated with some other data that is guaranteed unique per
page-write to allow encryption via the same per-relation-file derived
key. From what I gather, the LSN is a candidate though it'd have to be
stored in plaintext for decryption.
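
As a rough sketch of the layout described above (not a proposal for the
actual on-disk format), the 16-byte CTR input block could be assembled as
follows. The names counter_block and page_unique are hypothetical, and
placing the counter in the last two bytes in big-endian order is an
assumption.

```python
import struct

AES_BLOCK_SIZE = 16   # bytes per AES block
PAGE_SIZE = 8192      # bytes per 8K page

def counter_block(page_unique: bytes, block_index: int) -> bytes:
    """Return the 16-byte CTR input for one AES block of a page.

    page_unique: 14 bytes that must never repeat under the same key
                 (hypothetical; e.g. something built from the LSN).
    block_index: position of the AES block within the page.
    """
    if len(page_unique) != 14:
        raise ValueError("page_unique must be exactly 14 bytes")
    if not 0 <= block_index < 2 ** 16:
        raise ValueError("block_index must fit in 16 bits")
    # Assumed layout: 14 unique bytes followed by a big-endian 16-bit counter.
    return page_unique + struct.pack(">H", block_index)

blocks_per_page = PAGE_SIZE // AES_BLOCK_SIZE        # 512 AES blocks per page
bits_needed = (blocks_per_page - 1).bit_length()     # 9 bits cover 0..511

page_unique = bytes(14)  # placeholder value for illustration only
page_blocks = [counter_block(page_unique, i) for i in range(blocks_per_page)]
```

Every entry in page_blocks is distinct, so no AES block within the page
shares a counter value with another.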

What's important is that writing two pages (either different
locations or the same page back again) never reuses the same nonce
with the same key. Using the same nonce with a different key is fine.

With any of these schemes the same inputs will generate the same
outputs. With CTR mode for WAL this would be an issue if the same key
and deterministic nonce (ex: LSN + offset) are reused in multiple
places. That does not have to be the same cluster either. For example
if two replicas are promoted from the same backup with the same master
key, they would generate the same WAL CTR stream, reusing the
key/nonce pair. Ditto for starting off with a master key and deriving
per-relation keys in a cloned installation off some deterministic
attribute such as oid.
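
To illustrate why that reuse matters, here is a small sketch. It uses a
stand-in keystream built from HMAC-SHA256 rather than real AES-CTR (an
assumption made only to keep the example dependency-free), but the XOR
structure is the same: two ciphertexts produced with the same key/nonce
pair leak the XOR of their plaintexts.

```python
import hashlib
import hmac

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream: HMAC-SHA256(key, nonce || counter). NOT AES-CTR."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(2, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

master_key = b"\x01" * 32   # same key restored from the same backup
nonce = bytes(14)           # same deterministic nonce (e.g. LSN + offset)

# Two promoted replicas write different WAL at the same position.
p1 = b"replica A WAL record"
p2 = b"replica B WAL record"
c1 = encrypt(master_key, nonce, p1)
c2 = encrypt(master_key, nonce, p2)

# The keystream cancels out: an observer learns p1 XOR p2 without the key.
leaked = xor(c1, c2)
```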

This can be avoided by deriving new keys per file (not just per
relation) from a random salt. It'd be stored out of band and combined
with the master key to derive the specific key used for that CTR
stream. If there's a desire for supporting multiple ciphers or key
sizes, that could be stored alongside the salt. Perhaps use the same
location or lack of it to indicate "not encrypted" as well.
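
A minimal sketch of that derivation, assuming HMAC-SHA256 as the key
derivation function and a hypothetical out-of-band record made of a
one-byte cipher id followed by a 16-byte random salt. None of these
choices come from an agreed design; a real implementation would likely
use a standard KDF such as HKDF.

```python
import hashlib
import hmac
import os

CIPHER_NONE = 0          # hypothetical id meaning "not encrypted"
CIPHER_AES_128_CTR = 1   # hypothetical cipher ids
CIPHER_AES_256_CTR = 2

def new_file_record(cipher_id: int) -> bytes:
    """Build the out-of-band record: cipher id byte + 16 random salt bytes."""
    return bytes([cipher_id]) + os.urandom(16)

def derive_file_key(master_key: bytes, record: bytes) -> bytes:
    """Derive the per-file key from the master key and the stored record."""
    # Assumption: HMAC-SHA256 keyed with the master key acts as the KDF.
    return hmac.new(master_key, b"file-key" + record, hashlib.sha256).digest()

master_key = b"\x01" * 32
record_a = new_file_record(CIPHER_AES_256_CTR)
record_b = new_file_record(CIPHER_AES_256_CTR)
key_a = derive_file_key(master_key, record_a)
key_b = derive_file_key(master_key, record_b)
```

Two files, or the same file in a clone that generated a fresh salt, end
up with unrelated keys even though the master key is shared.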

Per-file salts and derived keys would facilitate re-keying a table
piecemeal, file by file, by generating a new salt/derived-key,
encrypting a copy of the decrypted file, and doing an atomic rename.
The file's contents would change but its length and any references to
pages or byte offsets would stay valid. (I think this would work for
CBC modes too as there's nothing CTR specific about it.)
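
The re-keying steps could look roughly like the sketch below. The cipher
is a stand-in (HMAC-SHA256 keystream, not AES), per-page nonces are
ignored, and the ".rekey" temp-file suffix and the segment name are
hypothetical. The point is only the shape: decrypt, re-encrypt a copy
under the new derived key, then swap it in with an atomic rename.

```python
import hashlib
import hmac
import os
import tempfile

def stream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a stand-in keystream (HMAC-SHA256 over a block counter).
    NOT AES-CTR; it only mimics a length-preserving stream cipher."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hmac.new(key, (i // 32).to_bytes(8, "big"), hashlib.sha256).digest()
        out += bytes(a ^ b for a, b in zip(data[i:i + 32], pad))
    return bytes(out)

def rekey_file(path: str, old_key: bytes, new_key: bytes) -> None:
    with open(path, "rb") as f:
        plaintext = stream_xor(old_key, f.read())
    tmp_path = path + ".rekey"        # hypothetical temp-file naming
    with open(tmp_path, "wb") as f:
        f.write(stream_xor(new_key, plaintext))
        f.flush()
        os.fsync(f.fileno())          # make the copy durable before the swap
    os.replace(tmp_path, path)        # atomic rename on POSIX filesystems

old_key, new_key = b"\x01" * 32, b"\x02" * 32
plaintext = b"page data " * 1000
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "16384.1")  # hypothetical relation segment name
    with open(path, "wb") as f:
        f.write(stream_xor(old_key, plaintext))
    with open(path, "rb") as f:
        before = f.read()
    rekey_file(path, old_key, new_key)
    with open(path, "rb") as f:
        after = f.read()
```

The file length is unchanged, so page numbers and byte offsets that
referred to the old file still point at the same logical data.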

What I'm not sure of is how to handle randomizing the relation file IV in a
cloned database. Until the key for a relation file or segment is
rotated it'd have the same deterministic IV generated as its source as
the LSN would continue from the same point. One idea is with 128-bits
for the IV, one could have 64-bits for LSN, 16-bits for AES-block
counter, and the remaining 48-bits be randomized; though you'd need to
store those 48-bits somewhere per-page (basically it's a salt per
page). That'd give some protection from the clone's new data being
encrypted with the same stream as the parent's. Another option would
be to track ranges of LSNs and have a centralized list of 48-bit
randomized salts. That would remove the need for additional salt per
page though you'd have to do a lookup on that shared list to figure
out which to use.
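
For the first idea, a sketch of how the 128 bits might be packed. The
field order (LSN, then salt, then counter in the low-order bytes) and
the function names are assumptions for illustration; only the field
widths come from the description above.

```python
import os
import struct

def pack_iv(lsn: int, salt48: bytes, block_counter: int) -> bytes:
    """Pack a 128-bit IV: 64-bit LSN | 48-bit random salt | 16-bit counter."""
    if len(salt48) != 6:
        raise ValueError("salt must be exactly 6 bytes (48 bits)")
    if not 0 <= block_counter < 2 ** 16:
        raise ValueError("block_counter must fit in 16 bits")
    return struct.pack(">Q", lsn) + salt48 + struct.pack(">H", block_counter)

def unpack_iv(iv: bytes):
    """Split a 16-byte IV back into (lsn, salt48, block_counter)."""
    lsn = struct.unpack(">Q", iv[:8])[0]
    block_counter = struct.unpack(">H", iv[14:])[0]
    return lsn, iv[8:14], block_counter

# Parent and clone continue from the same LSN but draw their own salts.
lsn = 0x16B374D848
parent_salt = os.urandom(6)   # would be stored per page
clone_salt = os.urandom(6)
parent_iv = pack_iv(lsn, parent_salt, 0)
clone_iv = pack_iv(lsn, clone_salt, 0)
```

With independently drawn salts the two IVs differ with overwhelming
probability even though the LSN and block counter are identical.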

CTR mode is definitely more complicated than a pure random-IV + CBC
but with any deterministic generation of IVs for CBC mode you're going
to have some of these same problems as well.

Regarding CRCs, CTR mode has the advantage of not destroying the rest
of the stream to replace the CRC bytes. With CBC mode any change would
cascade and corrupt the rest of the data downstream from that block.
With CTR mode you can overwrite the CRC's location with the CRC or a
truncated MAC of the encrypted data as each byte is encrypted
separately. At decryption time you simply ignore the decrypted output
of those bytes and zero them out again. A CRC of encrypted data (but
not a partial MAC) could be checked offline without access to the key.
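
A sketch of that CRC handling, again with a stand-in HMAC-SHA256
keystream instead of AES-CTR. The CRC slot offset, the use of CRC-32
from zlib, and the function names are assumptions and are not meant to
match the real page header layout.

```python
import hashlib
import hmac
import struct
import zlib

PAGE_SIZE = 8192
CRC_OFFSET = 8    # hypothetical position of the CRC slot within the page
CRC_LEN = 4
ZERO_SLOT = b"\x00" * CRC_LEN

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream: HMAC-SHA256(key, nonce || counter). NOT AES-CTR."""
    out = bytearray()
    for counter in range((length + 31) // 32):
        out += hmac.new(key, nonce + counter.to_bytes(2, "big"), hashlib.sha256).digest()
    return bytes(out[:length])

def seal_page(key: bytes, nonce: bytes, page: bytes) -> bytes:
    """Encrypt a page, then overwrite the CRC slot with a CRC of the ciphertext."""
    ct = bytearray(a ^ b for a, b in zip(page, keystream(key, nonce, len(page))))
    ct[CRC_OFFSET:CRC_OFFSET + CRC_LEN] = ZERO_SLOT
    crc = zlib.crc32(bytes(ct))
    ct[CRC_OFFSET:CRC_OFFSET + CRC_LEN] = struct.pack(">I", crc)
    return bytes(ct)

def check_crc(ct: bytes) -> bool:
    """Offline check: needs only the ciphertext, not the key."""
    stored = struct.unpack(">I", ct[CRC_OFFSET:CRC_OFFSET + CRC_LEN])[0]
    zeroed = bytearray(ct)
    zeroed[CRC_OFFSET:CRC_OFFSET + CRC_LEN] = ZERO_SLOT
    return zlib.crc32(bytes(zeroed)) == stored

def open_page(key: bytes, nonce: bytes, ct: bytes) -> bytes:
    """Decrypt, then ignore whatever the CRC slot decrypted to and zero it."""
    pt = bytearray(a ^ b for a, b in zip(ct, keystream(key, nonce, len(ct))))
    pt[CRC_OFFSET:CRC_OFFSET + CRC_LEN] = ZERO_SLOT
    return bytes(pt)

key, nonce = b"\x01" * 32, bytes(14)
page = bytearray(bytes(range(256)) * (PAGE_SIZE // 256))
page[CRC_OFFSET:CRC_OFFSET + CRC_LEN] = ZERO_SLOT   # slot is zero in plaintext
page = bytes(page)
sealed = seal_page(key, nonce, page)
```

Because each byte is XORed independently, replacing the four slot bytes
in the ciphertext affects only those four bytes on decryption.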

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
