Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-09 21:42:01
Message-ID: 20190709214201.7bg76z7gwurlzrk3@development
Lists: pgsql-hackers

On Tue, Jul 09, 2019 at 03:50:39PM -0400, Bruce Momjian wrote:
>On Tue, Jul 9, 2019 at 02:09:38PM -0400, Joe Conway wrote:
>> On 7/9/19 11:11 AM, Bruce Momjian wrote:
>> > Good point about nonce and IV. I wonder if running the nonce
>> > through the cipher with the key makes it random enough to use as an
>> > IV.
>>
>> Based on that NIST document it seems so.
>>
>> The trick will be to be 100% sure we never reuse a nonce that is used
>> to produce the IV when using the same key.
>>
>> I think the potential to get that wrong (i.e. inadvertently reuse a
>> nonce) would lead to using the second described method
>>
>> "The second method is to generate a random data block using a
>> FIPS-approved random number generator."
>>
>> That method is what I am used to seeing. But with the second method
>> we need to store the IV, with the first we could reproduce it if we
>> select our initial nonce carefully.
>>
>> So thinking out loud, and perhaps you already said this Bruce, but I
>> guess the input nonce used to generate the IV could be something like
>> pg_class.oid and blocknum concatenated together with some delimiting
>> character as long as we guarantee that we generate different keys in
>> different databases. Then there would be no need to store the IV since
>> we could reproduce it.
>
>Uh, yes, and no. Yes, we can use the pg_class.oid (since it has to
>be preserved by pg_upgrade anyway), and the page number. However,
>different databases can have the same pg_class.oid/page number
>combination, so there would be duplication between databases. Now, you
>might say let's add the pg_database.oid, but unfortunately, because of
>the way we file-system-copy files from one database to another during
>database creation (it doesn't go through shared buffers), we can't use
>pg_database.oid as part of the nonce.
>
>My only idea here is that we actually decrypt/re-encrypt pages as we
>copy them at the file system level during database creation to match the
>new pg_database.oid. This would allow pg_database.oid in the nonce/IV.
>(I think we will need to modify pg_upgrade to preserve pg_database.oid.)
>
>If the nonce/IV is 96 bits, then that is 12 bytes or 3 4-byte values.
>pg_class.oid is 4 bytes, pg_database.oid is 4 bytes, and that leaves
>4-bytes for the block number, which gets us to 32TB before the page
>counter would overflow a 4-byte value, and our max table size is 32TB
>anyway, so that all works.
>
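For reference, the 96-bit layout described above can be sketched roughly as follows. This is only an illustration: the function name, the example OIDs, and the big-endian byte order are my assumptions, not anything that has been proposed as a patch.

```python
import struct

BLOCK_SIZE = 8192  # default PostgreSQL page size (BLCKSZ), assumed here


def oid_block_nonce(db_oid: int, rel_oid: int, block_no: int) -> bytes:
    """Pack three unsigned 32-bit values into a 12-byte (96-bit) nonce.

    The field order and byte order are arbitrary choices for this sketch.
    """
    return struct.pack(">III", db_oid, rel_oid, block_no)


# Hypothetical OIDs, for illustration only.
nonce = oid_block_nonce(16384, 16385, 0)

# A 4-byte block counter addresses 2^32 pages of BLOCK_SIZE bytes each.
max_bytes = 2**32 * BLOCK_SIZE
max_tib = max_bytes // 2**40  # 32
```

With 8kB pages the 4-byte block number covers 2^32 * 8192 bytes = 32TB, which matches the arithmetic in the quoted text.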

I don't think that works, because that'd mean we're encrypting the same
page with the same nonce over and over, which means reusing the nonce
(even if you hash/encrypt it). Or did I miss something?
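
To illustrate the concern, here is a small sketch of what happens when two versions of the same page are encrypted with the same key and the same nonce, assuming a stream-style mode such as CTR. The keystream below is a toy built from HMAC-SHA256 so the example runs with the standard library only; it stands in for AES-CTR and is not a real cipher. The key, OIDs and page contents are made up.

```python
import hashlib
import hmac
import struct


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy CTR-style keystream: HMAC-SHA256(key, nonce || counter) blocks."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = nonce + struct.pack(">I", counter)
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = b"k" * 32                               # hypothetical key
nonce = struct.pack(">III", 16384, 16385, 0)  # same (db, rel, block) each time

page_v1 = b"balance=0000100;"  # page contents before an update
page_v2 = b"balance=9999999;"  # same page after an update

c1 = encrypt(key, nonce, page_v1)
c2 = encrypt(key, nonce, page_v2)

# With a repeated nonce the keystream cancels out: XOR of the two
# ciphertexts equals XOR of the two plaintexts, without knowing the key.
leak = xor(c1, c2)
assert leak == xor(page_v1, page_v2)
```

Someone who can see both versions of the page on disk learns the XOR of the plaintexts, and running the nonce through a hash or cipher first does not help, because the derived IV is still identical both times.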

There are two basic ways to construct nonces - CSPRNG and sequences, and
then a combination of both, i.e. one part is generated from a sequence
and one randomly.

FWIW not sure using OIDs as nonces directly is a good idea, as those are
inherently low entropy data - how often do you see databases with OIDs
above 1M or so? Probably not very often, and in most cases those are
databases where those OIDs are for OIDs and large objects, so irrelevant
for this purpose. I might be wrong, but having a 96-bit nonce with maybe
just 32 bits of entropy seems suspicious.

That does not mean we can't use the OIDs at all, but maybe we could hash
them into a single 4B value, and then pick the remaining 8B randomly.
Also, we have a "natural" sequence in the database - LSNs, maybe that
would be a good source of nonces too?
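
A rough sketch of both ideas, just to make the layout concrete. All names here are hypothetical, and the choice of SHA-256 truncated to 4 bytes is an assumption made for the example. The random variant would have to be stored with the page (as Joe noted for the random-IV method), and the LSN variant assumes the LSN advances between successive encryptions of the same page, which this sketch does not check.

```python
import hashlib
import secrets
import struct


def oid_tag(db_oid: int, rel_oid: int) -> bytes:
    """Hash the two OIDs down to a single 4-byte value."""
    digest = hashlib.sha256(struct.pack(">II", db_oid, rel_oid)).digest()
    return digest[:4]


def random_nonce(db_oid: int, rel_oid: int) -> bytes:
    """4B OID hash + 8B from a CSPRNG; must be stored to allow decryption."""
    return oid_tag(db_oid, rel_oid) + secrets.token_bytes(8)


def lsn_nonce(db_oid: int, rel_oid: int, lsn: int) -> bytes:
    """4B OID hash + the 64-bit LSN as the sequence part."""
    return oid_tag(db_oid, rel_oid) + struct.pack(">Q", lsn)


# Hypothetical OIDs and LSNs, for illustration only.
n1 = random_nonce(16384, 16385)
n2 = random_nonce(16384, 16385)
s1 = lsn_nonce(16384, 16385, 0x16B3748)
s2 = lsn_nonce(16384, 16385, 0x16B3780)
```

Both variants give a 12-byte nonce whose first 4 bytes identify the relation and whose remaining 8 bytes change from one encryption of the page to the next.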

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
