Re: XTS cipher mode for cluster file encryption

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XTS cipher mode for cluster file encryption
Date: 2021-10-15 19:22:48
Message-ID: 20211015192248.GP20998@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> As you might have seen from my email in another thread, thanks to
> Stephen and Cybertec staff, I am back working on cluster file
> encryption/TDE.
>
> Stephen was going to research if XTS cipher mode would be a good fit for
> this since it was determined that CTR cipher mode was too vulnerable to
> IV reuse, and the LSN provides insufficient uniqueness. Stephen
> reported having trouble finding a definitive answer, so I figured I
> would research it myself.
>
> Of course, I found the same lack of information that Stephen did. ;-)
> None of my classic cryptographic books cover XTS or the XEX encryption
> mode it is based on, since XTS was only standardized in 2007 and
> recommended in 2010. (Yeah, don't get me started on poor cryptographic
> documentation.)
>
> Therefore, I decided to go backward and look at CTR and CBC to see how
> the nonce is used there, and if XTS fixes problems with nonce reuse.
>
> First, I originally chose CTR mode since it works as a stream cipher,
> and we therefore could skip certain page fields like the LSN. However,
> CTR is very sensitive to LSN reuse since the input bits generate
> encrypted bits in exactly the same locations on the page. (It uses a
> simple XOR of the input against a keystream.) Since sometimes pages
> with different page contents are encrypted with the same LSN,
> especially on replicas, this method failed.
>
> Second is CBC mode, which is a block cipher mode. I thought that meant
> that you could only encrypt 16-byte chunks, meaning you couldn't skip
> encryption of certain page fields unless they are 16-byte chunks.
> However, there is something called ciphertext stealing
> (https://en.wikipedia.org/wiki/Ciphertext_stealing#CBC_ciphertext_stealing)
> which allows that. I am not sure if OpenSSL supports this, but looking
> at my OpenSSL 1.1.1d manual entry for EVP_aes, ciphertext stealing is
> only mentioned for XTS.
>
> Anyway, CBC mode still needs a nonce for the first 16-byte block, and
> then feeds the encrypted output of the first block as an IV to the second
> block, etc. This gives us the same problem with finding a nonce per
> page. However, since it is a block cipher, the bits don't output in the
> same locations they have on input, so that is less of a problem. There
> is also the problem that the encrypted output from one 16-byte block
> could repeat, causing leakage.
>
> So, let's look at how XTS is designed. First, it uses two keys. If you
> are using AES128, you need _two_ 128-bit keys. If using AES256, you
> need two 256-bit keys. The first of the two keys is used like normal,
> to encrypt the data. The second key, which is also secret, is used to
> encrypt the values used for the IV for the first 16-byte block (in our
> case dboid, relfilenode, blocknum, maybe LSN). This is most clearly
> explained here:
>
> https://www.kingston.com/unitedstates/en/solutions/data-security/xts-encryption
>
> That IV is XOR'ed against both the input value and the encryption output
> value, as explained here as key tweaking:
>
> https://crossbowerbt.github.io/xts_mode_tweaking.html
>
> The purpose of using it before and after encryption is explained here:
>
> https://crypto.stackexchange.com/questions/24431/what-is-the-benefit-of-applying-the-tweak-a-second-time-using-xts
>
> The second 16-byte block gets an IV that is the multiplication of the
> first IV and an alpha value, computed in a finite field (the Galois
> field GF(2^128), reduced modulo an irreducible polynomial), and each
> later block multiplies by alpha again. This effectively means an
> attacker has _no_ idea what the IV is since it involves a secret key,
> and each 16-byte block uses a different, unpredictable IV value. XTS
> also supports ciphertext stealing by default so we can use the LSN if we
> want, but we aren't sure we need to.

Yeah, this all seems to be about where I got to too.
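
To make the per-block IV (tweak) derivation described above concrete,
here is a minimal Python sketch. It is not taken from any patch on this
thread. It assumes the IEEE 1619 conventions as I understand them: the
16-byte value is treated as a little-endian integer, each successive
16-byte block multiplies the previous tweak by alpha, and the reduction
polynomial is x^128 + x^7 + x^2 + x + 1. No AES is performed;
`encrypted_iv` is a hypothetical stand-in for the page IV after it has
been encrypted with the second key.

```python
from typing import List

GF128_FEEDBACK = 0x87          # low bits of x^128 + x^7 + x^2 + x + 1
MASK128 = (1 << 128) - 1


def xts_mul_alpha(tweak: bytes) -> bytes:
    """Multiply a 16-byte tweak by alpha (the polynomial x) in GF(2^128)."""
    if len(tweak) != 16:
        raise ValueError("tweak must be 16 bytes")
    n = int.from_bytes(tweak, "little") << 1
    if n >> 128:               # a bit was shifted out: reduce it back in
        n = (n & MASK128) ^ GF128_FEEDBACK
    return n.to_bytes(16, "little")


def xts_tweaks(encrypted_iv: bytes, nblocks: int) -> List[bytes]:
    """Return the tweak for each of the first nblocks 16-byte blocks."""
    tweaks = []
    tweak = encrypted_iv       # stand-in for the IV encrypted with key 2
    for _ in range(nblocks):
        tweaks.append(tweak)
        tweak = xts_mul_alpha(tweak)
    return tweaks


# An 8kB page holds 512 blocks of 16 bytes, so it needs 512 tweaks.
page_tweaks = xts_tweaks(bytes(range(16)), 512)
```

The point of the sketch is only that every tweak after the first is
cheap to derive (a shift and a conditional XOR), while all of them
depend on a value the attacker cannot compute without the second key.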

> Finally, there is an interesting web page about when not to use XTS:
>
> https://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/

This particular article always struck me as more of a reason for us, at
least, to use XTS than not to. In particular, the very first comment it
makes, which seems to be pretty well supported, is: "XTS is the de-facto
standard disk encryption mode." Much of the rest of it is the
well-trodden discussion we've had about how FDE (or TDE in our case)
doesn't protect against all the attack vectors that people sometimes
think it does. Another point is that XTS isn't authenticated, which is
something else we know quite well around here and isn't news.

> Basically, what XTS does is to make the IV unknown to attackers and
> non-repeating except for multiple writes to a specific 16-byte block
> (with no LSN change). What isn't clear is if repeated encryption of
> different data in the same 16-byte block can leak data.

Any time a subset of the data is changed but the rest of it isn't,
there's a leak of information. This is a really good example of exactly
what that looks like:

https://github.com/robertdavidgraham/ecb-penguin

In our case, if/when this happens (no LSN change, repeated encryption
of the same block), someone might be able to deduce that hint bits were
being updated/changed, and where some of those are in the block.
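
The kind of leak described above can be illustrated with a toy model.
The sketch below is an assumption-laden illustration, not real XTS: it
uses a 4-round Feistel network built from SHA-256 as a stand-in for AES
(hypothetical and not secure), uses an XEX-style "XOR tweak, encrypt,
XOR tweak" step per 16-byte block, and skips ciphertext stealing. All
names, keys and the IV value are made up for the example.

```python
import hashlib

BLOCK = 16  # cipher block size in bytes


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _feistel_round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 16-byte block cipher (4-round Feistel). NOT AES, NOT secure."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _feistel_round(key, i, right))
    return left + right


def mul_alpha(tweak: bytes) -> bytes:
    """Multiply the tweak by alpha in GF(2^128), little-endian convention."""
    n = int.from_bytes(tweak, "little") << 1
    if n >> 128:
        n = (n & ((1 << 128) - 1)) ^ 0x87
    return n.to_bytes(16, "little")


def toy_xex_encrypt(data_key: bytes, tweak_key: bytes,
                    iv: bytes, page: bytes) -> bytes:
    """Encrypt a page whose length is a multiple of 16 bytes."""
    assert len(iv) == BLOCK and len(page) % BLOCK == 0
    tweak = toy_block_encrypt(tweak_key, iv)
    out = bytearray()
    for off in range(0, len(page), BLOCK):
        block = page[off:off + BLOCK]
        out += _xor(toy_block_encrypt(data_key, _xor(block, tweak)), tweak)
        tweak = mul_alpha(tweak)
    return bytes(out)


def changed_blocks(ct_a: bytes, ct_b: bytes) -> list:
    """Indexes of the 16-byte ciphertext blocks that differ."""
    return [off // BLOCK for off in range(0, len(ct_a), BLOCK)
            if ct_a[off:off + BLOCK] != ct_b[off:off + BLOCK]]


data_key, tweak_key = b"D" * 16, b"T" * 16
iv = (1234).to_bytes(16, "little")   # same IV both times: no LSN change
page_v1 = bytes(64)
page_v2 = bytearray(page_v1)
page_v2[37] |= 0x02                  # flip one "hint bit" in block 2
ct1 = toy_xex_encrypt(data_key, tweak_key, iv, page_v1)
ct2 = toy_xex_encrypt(data_key, tweak_key, iv, bytes(page_v2))
print(changed_blocks(ct1, ct2))      # prints [2]
```

An observer comparing the two ciphertexts learns that something inside
block 2 changed, but the changed ciphertext block gives no direct view
of which bit it was.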

That said, I don't think that's really a huge issue or something that's
a show stopper or a reason to hold off on using XTS. Note that what
those bits actually *are* isn't leaked, just that they changed in some
fashion inside of that 16-byte cipher text block. That they're directly
leaked with CTR is why there was concern raised about using that method,
as discussed above and previously.
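
For contrast, here is a small sketch of why keystream reuse in a
CTR-like mode leaks the changed bits directly. It assumes a toy
keystream built from SHA-256 in place of AES-CTR (hypothetical, for
illustration only); the key, the `same-lsn` nonce and the page contents
are made up. The property shown, that XORing two ciphertexts made with
the same keystream yields the XOR of the plaintexts, holds for any
XOR-based stream mode.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream from SHA-256. NOT AES-CTR, NOT secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_ctr_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor_bytes(plaintext, toy_keystream(key, nonce, len(plaintext)))


key = b"k" * 16
nonce = b"same-lsn"                  # reused: the LSN did not change
page_v1 = bytes(32)
page_v2 = bytearray(page_v1)
page_v2[5] |= 0x04                   # flip one "hint bit" in byte 5
page_v2 = bytes(page_v2)

ct1 = toy_ctr_encrypt(key, nonce, page_v1)
ct2 = toy_ctr_encrypt(key, nonce, page_v2)

# The keystream cancels out, leaving exactly the plaintext difference.
leak = xor_bytes(ct1, ct2)
print([(i, hex(b)) for i, b in enumerate(leak) if b])   # prints [(5, '0x4')]
```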

> This probably needs more research and maybe we need to write something
> up like the above and let security researchers review it since there
> doesn't seem to be enough documentation for us to decide ourselves.

The one issue identified here is hopefully answered above and given that
what you've found matches what I found, I'd argue that moving forward
with XTS makes sense.

The other bit of research that I wanted to do, and thanks for sending
this and prodding me to go do so, was to look at other implementations
and see what they do for the IV when it comes to using XTS, and this is
what I found:

https://wiki.gentoo.org/wiki/Dm-crypt_full_disk_encryption

Specifically: The default cipher for LUKS is nowadays aes-xts-plain64

and then this:

https://gitlab.com/cryptsetup/cryptsetup/-/wikis/DMCrypt

where plain64 is defined as:

plain64: the initial vector is the 64-bit little-endian version of the
sector number, padded with zeros if necessary

That is, the default for LUKS is AES, XTS, with a simple IV. That
strikes me as a pretty ringing endorsement.
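
As a minimal sketch of the plain64 definition quoted above (assuming a
16-byte IV, as used with AES; the function name is made up for the
example and is not a dm-crypt API):

```python
import struct


def plain64_iv(sector: int) -> bytes:
    """Build a plain64-style IV: 64-bit little-endian sector number,
    zero-padded to 16 bytes."""
    # "<Q" packs an unsigned 64-bit little-endian integer; struct.error
    # is raised if the sector number does not fit in 64 bits.
    return struct.pack("<Q", sector) + bytes(8)


iv_first = plain64_iv(0)     # all zero bytes
iv_second = plain64_iv(1)    # 0x01 followed by fifteen zero bytes
```

The IV is fully predictable from the sector number, which is the
"simple IV" point: the unpredictability comes from encrypting it with
the second key, not from the IV itself.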

Now, to address the concern around re-encrypting a block with the same
key+IV but different data and leaking what parts of the page changed, I
do think we should use the LSN and have it change regularly (including
for unlogged tables), but that's just because it's relatively easy for
us to do and means an attacker wouldn't be able to tell what part of the
page changed when the LSN was also changed. That was also recommended by
NIST, which is a pretty strong endorsement as well.

I'm all for getting security folks and whomever else to come and review
this thread and chime in with their thoughts, but I don't think it's a
reason to hold off on moving forward with the approach that we've been
converging towards.

Thanks!

Stephen
