Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "Moon, Insung" <Moon_Insung_i3(at)lab(dot)ntt(dot)co(dot)jp>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Date: 2019-07-19 10:04:36
Message-ID: 5339.1563530676@localhost
Lists: pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:

> On Mon, Jul 15, 2019 at 03:42:39PM -0400, Bruce Momjian wrote:
> >On Sat, Jul 13, 2019 at 11:58:02PM +0200, Tomas Vondra wrote:
> >> One extra thing we should consider is authenticated encryption. We can't
> >> just encrypt the pages (no matter which AES mode is used - XTS/CBC/...),
> >> as that does not provide integrity protection (i.e. can't detect when
> >> the ciphertext was corrupted due to disk failure or intentionally). And
> >> we can't quite rely on checksums, because that checksums the plaintext
> >> and is stored encrypted.
> >
> >Uh, if someone modifies a few bytes of the page, we will decrypt it, but
> >the checksum (per-page or WAL) will not match our decrypted output. How
> >would they make it match the checksum without already knowing the key.
> >I read [1] but could not see that explained.
> >
>
> Our checksum is only 16 bits, so perhaps one way would be to just
> generate 64k of randomly modified pages and hope one of them happens to
> hit the right checksum value. Not sure how practical such attack is, but
> it does require just filesystem access.

I don't think you can easily generate 64k different checksums this way. If
the data is random, I suppose that for each checksum value there is a set of
2^(128 - 16) blocks that contain it after decryption. Thus even if you
generate 64k different ciphertext blocks that contain the checksum, some
(many?) of the resulting checksums will be duplicates. Unfortunately the math
to describe this problem does not seem to be trivial.
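
As a rough sketch of that math, assume that each modified ciphertext block
decrypts to a checksum field that behaves like an independent, uniformly
random 16-bit value. That is only an assumption for illustration, not
something checked against the actual page checksum code. Under it, the
expected number of distinct checksum values after n attempts is
N * (1 - (1 - 1/N)^n) with N = 2^16, so 64k attempts should cover only about
63% of the possible values. The function names below are hypothetical.

```python
import random

CHECKSUM_SPACE = 2 ** 16  # number of possible 16-bit checksum values


def expected_distinct(trials: int, space: int = CHECKSUM_SPACE) -> float:
    """Expected count of distinct values among `trials` uniform random draws."""
    # Each value is missed by all draws with probability (1 - 1/space)^trials.
    return space * (1.0 - (1.0 - 1.0 / space) ** trials)


def simulate_distinct(trials: int, space: int = CHECKSUM_SPACE, seed: int = 0) -> int:
    """Monte Carlo check: draw uniform values and count the distinct ones."""
    rng = random.Random(seed)
    return len({rng.randrange(space) for _ in range(trials)})


if __name__ == "__main__":
    n = CHECKSUM_SPACE  # "64k" attempts
    print("expected distinct checksums:", expected_distinct(n))
    print("simulated distinct checksums:", simulate_distinct(n))
```

Under the same assumption, 1 - (1 - 1/N)^n is also the probability that at
least one of the n attempts hits one particular target checksum.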

Also note that if you try to generate a ciphertext whose decryption yields a
particular checksum value, you can hardly control the other 14 bytes of the
block, which in turn are used to verify the checksum.

> FWIW our CRC algorithm is not quite HMAC, because it's neither keyed nor
> a cryptographic hash algorithm. Now, maybe we don't want authenticated
> encryption (e.g. XTS is not authenticated, unlike GCM/CCM).

I'm also not sure if we should try to guarantee data authenticity /
integrity. As someone already mentioned elsewhere, a page MAC does not help if
the whole page is replaced. (An extreme case is that an old filesystem
snapshot containing the whole data directory is restored, although that will
probably make the database crash soon.)

We can guarantee the integrity and authenticity of a backup, but that's a
separate feature: someone may need this even if it's o.k. for them to run the
cluster unencrypted.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
