From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Kincaid <tomjohnkincaid(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: storing an explicit nonce
Date: 2021-05-26 19:48:27
Message-ID: CA+TgmoY0KBnTJzwAx6WqKNNyistzEstOnEuY7R53ZPX2_yoiUg@mail.gmail.com
Lists: pgsql-hackers
On Wed, May 26, 2021 at 2:37 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Anybody got a better idea?
>
> If we stipulate (and document) that all replicas need their own keys
> then we no longer need to worry about nonce re-use between the primary
> and the replica. Not sure that's *better*, per se, but I do think it's
> worth consideration. Teaching pg_basebackup how to decrypt and then
> re-encrypt with a different key wouldn't be challenging.
I agree that we could do that and that it's possibly worth
considering. However, it would be easy - and tempting - for users to
violate the no-nonce-reuse principle. For example, consider a
hypothetical user who takes a backup on Monday via a filesystem
snapshot - which might be either (a) a snapshot of the cluster while
it is stopped, or (b) a snapshot of the cluster while it's running,
from which crash recovery can be safely performed as long as it's a
true atomic snapshot, or (c) a snapshot taken between pg_start_backup
and pg_stop_backup which will be used just like a backup taken by
pg_basebackup. In any of these cases, there's no opportunity for a
tool we provide to intervene and re-key. Now, we could provide a tool
that re-keys in such situations and tell people to be sure they run it
before using any of those backups, and maybe that's the best we can
do. However, that tool is going to run for a good long time because it
has to rewrite the entire cluster, so someone with a terabyte-scale
database is going to be sorely tempted to skip this "unnecessary" and
time-consuming step. If it were possible to set things up so that good
things happen automatically and without user action, that would be
swell.
Here's another idea: suppose that a nonce is 128 bits, 64 of which are
randomly generated at server startup, and the other 64 of which are a
counter. If you're willing to assume that the 64 bits generated
randomly at server startup are not going to collide in practice,
because the number of server lifetimes per key should be very small
compared to 2^64, then this gets you the benefits of a
randomly-generated nonce without needing to keep on generating new
cryptographically strong random numbers, and pretty much regardless of
what users do with their backups. If you replay an FPI, you can write
out the page exactly as you got it from the master, without
re-encrypting. If you modify and then write a page, you generate a
nonce for it containing your own server lifetime identifier.
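To make the shape of that concrete, here's a minimal sketch of how such a
nonce could be assembled; every name is hypothetical and nothing here is
from an actual patch (a real implementation would also need locking or an
atomic counter, and the startup identifier would come from a CSPRNG):

```c
#include <stdint.h>

/* 128-bit nonce: a per-server-lifetime random half plus a counter half. */
typedef struct EncryptionNonce
{
	uint64_t	server_lifetime_id;	/* random, drawn once at startup */
	uint64_t	counter;			/* incremented for every page write */
} EncryptionNonce;

static uint64_t nonce_server_lifetime_id;
static uint64_t nonce_counter;

/* Called once at server start with 64 cryptographically random bits. */
void
nonce_startup_init(uint64_t random_id)
{
	nonce_server_lifetime_id = random_id;
	nonce_counter = 0;
}

/* Produce the next nonce; the caller copies the 16 bytes into the page. */
EncryptionNonce
nonce_next(void)
{
	EncryptionNonce n;

	n.server_lifetime_id = nonce_server_lifetime_id;
	n.counter = ++nonce_counter;
	return n;
}
```

A replica with its own server lifetime identifier then never produces a
nonce that collides with one generated on the primary, except for FPIs it
writes out verbatim, which are identical ciphertext anyway.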
> Yes, if the amount of space available is variable then there's an added
> cost for that. While I appreciate the concern about having that be
> expensive, for my 2c at least, I like to think that having this sudden
> space that's available for use may lead to other really interesting
> capabilities beyond the ones we're talking about here, so I'm not really
> thrilled with the idea of boiling it down to just two cases.
Although I'm glad you like some things about this idea, I think the
proposed system will collapse if we press it too hard. We're going to
need to be judicious.
> One thing to be absolutely clear about here though is that simply taking
> a hash() of the ciphertext and storing that with the data does *not*
> provide cryptographic data integrity validation for the page because it
> doesn't involve the actual key or IV at all and the hash is done after
> the ciphertext is generated- therefore an attacker can change the data
> and just change the hash to match and you'd never know.
Ah, right. So you'd actually want something more like
hash(dboid||tsoid||relfilenode||blockno||block_contents||secret).
Maybe not generated exactly that way: perhaps the secret is really the
IV for the hash function rather than part of the hashed data, or
whatever. However you do it exactly, it prevents someone from
verifying - or faking - a signature unless they have the secret.
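Purely to illustrate the shape of that construction, a keyed digest over
the page identity plus contents might look like the sketch below, using
OpenSSL's HMAC so the secret actually keys the digest rather than being
appended data; the field names and the 8kB page-size assumption are mine,
not from any patch:

```c
#include <stdint.h>
#include <string.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <openssl/sha.h>

/* Compute a keyed MAC over (dboid, tsoid, relfilenode, blockno, contents). */
int
compute_page_mac(const unsigned char *secret, int secret_len,
				 uint32_t dboid, uint32_t tsoid, uint32_t relfilenode,
				 uint32_t blockno,
				 const unsigned char *block_contents, size_t block_len,
				 unsigned char out_mac[SHA256_DIGEST_LENGTH])
{
	unsigned char buf[16 + 8192];	/* 16-byte identity prefix + 8kB page */
	size_t		len = 0;
	unsigned int mac_len = 0;

	if (block_len > 8192)
		return -1;

	memcpy(buf + len, &dboid, sizeof(dboid)); len += sizeof(dboid);
	memcpy(buf + len, &tsoid, sizeof(tsoid)); len += sizeof(tsoid);
	memcpy(buf + len, &relfilenode, sizeof(relfilenode)); len += sizeof(relfilenode);
	memcpy(buf + len, &blockno, sizeof(blockno)); len += sizeof(blockno);
	memcpy(buf + len, block_contents, block_len); len += block_len;

	/* HMAC mixes the secret into the digest; an attacker without the
	 * key cannot recompute a matching MAC for modified data. */
	if (HMAC(EVP_sha256(), secret, secret_len, buf, len,
			 out_mac, &mac_len) == NULL)
		return -1;
	return 0;
}
```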
> very hard for the attacker to discover) and suddenly you're doing what
> AES-GCM *already* does for you, except you're trying to hack it yourself
> instead of using the tools available which were written by experts.
I am all in favor of using the expert-written tools provided we can
figure out how to do it in a way we all agree is correct.
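For comparison, here's a minimal sketch of the expert-written route: an
AES-256-GCM call through OpenSSL's EVP interface, where the library binds
the key, the nonce (IV), and the authentication tag together for you.
Parameter names are illustrative and error handling is abbreviated:

```c
#include <stddef.h>
#include <openssl/evp.h>

/* Encrypt one page with AES-256-GCM; GCM keeps ciphertext the same length
 * as plaintext and emits a 16-byte authentication tag alongside it. */
int
gcm_encrypt_page(const unsigned char key[32],
				 const unsigned char nonce[12],
				 const unsigned char *plaintext, int pt_len,
				 unsigned char *ciphertext,
				 unsigned char tag[16])
{
	EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
	int			len = 0;
	int			ok = 0;

	if (ctx == NULL)
		return -1;

	if (EVP_EncryptInit_ex(ctx, EVP_aes_256_gcm(), NULL, key, nonce) == 1 &&
		EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, pt_len) == 1 &&
		EVP_EncryptFinal_ex(ctx, ciphertext + len, &len) == 1 &&
		EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, 16, tag) == 1)
		ok = 1;

	EVP_CIPHER_CTX_free(ctx);
	return ok ? 0 : -1;
}
```

The tag plays the role of the integrity verifier: tampering with either
the ciphertext or the nonce makes decryption fail authentication.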
> What this means for your proposal above is that the actual data
> validation information will be generated in two different ways depending
> on if we're using AES-GCM and doing TDE, or if we're doing just the data
> validation piece and not encrypting anything. That's maybe not ideal
> but I don't think it's a huge issue either and your proposal will still
> address the question of if we end up missing anything when it comes to
> how the special area is handled throughout the code.
Hmm. Is there no expert-written method for this sort of thing without
encryption? One thing that I think would be really helpful is to be
able to take a TDE-ified cluster and run it through decryption, ending
up with a cluster that still has extra special space but which isn't
actually encrypted any more. Ideally it can end up in a state where
integrity validation still works. This might be something people just
Want To Do, and they're willing to sacrifice the space. But it would
also be real nice for testing and debugging. Imagine for example that
the data on page X is physiologically corrupted i.e. decryption
produces something that looks like a page, but there's stuff wrong
with it, like the item pointers point to a page offset greater than
the page size. Well, what you really want to do with this page is run
pg_filedump on it, or hexdump, or od, or pg_hexedit, or whatever your
favorite tool is, so that you can figure out what's going on, but
that's going to be hard if the pages are all encrypted.
I guess nothing in what you are saying really precludes that, but I
agree that if we have to switch up the method for creating the
integrity verifier thing in this situation, that's not great.
> If it'd help, I'd be happy to jump on a call to discuss further. Also
> happy to continue on this thread too, of course.
I am finding the written discussion to be helpful right now, and it
has the advantage of being easy to refer back to later, so my vote
would be to keep doing this for now and we can always reassess if it
seems to make sense.
--
Robert Haas
EDB: http://www.enterprisedb.com