From: | Cary Huang <cary(dot)huang(at)highgo(dot)ca> |
---|---|
To: | "Cary Huang" <cary(dot)huang(at)highgo(dot)ca> |
Cc: | "Masahiko Sawada" <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Fabien COELHO" <coelho(at)cri(dot)ensmp(dot)fr>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Sehrope Sarkuni" <sehrope(at)jackdb(dot)com>, "cary huang" <hcary328(at)gmail(dot)com>, "Moon, Insung" <tsukiwamoon(dot)pgsql(at)gmail(dot)com>, "Ibrar Ahmed" <ibrar(dot)ahmad(at)gmail(dot)com>, "Joe Conway" <mail(at)joeconway(dot)com>, "Bruce Momjian" <bruce(dot)momjian(at)enterprisedb(dot)com> |
Subject: | Re: Internal key management system |
Date: | 2020-03-02 23:48:53 |
Message-ID: | 1709da70509.122865d61130010.5827932610677290190@highgo.ca |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Masahiko
Please see below my comments regarding kms_v4.patch that you have shared earlier.
(1)
There is a discrepancy between the documentation and the actual function definition. For example in func.sgml, it states pg_wrap_key takes argument in text data type but in pg_proc.dat and kmgr.c, the function is defined to take argument in bytea data type.
===> doc/src/sgml/func.sgml
+ <entry>
+ <indexterm>
+ <primary>pg_wrap_key</primary>
+ </indexterm>
+ <literal><function>pg_wrap_key(<parameter>data</parameter> <type>text</type>)</function></literal>
+ </entry>
+ <entry>
+ <type>bytea</type>
+ </entry>
===> src/include/catalog/pg_proc.dat
+{ oid => '8201', descr => 'wrap the given secret',
+ proname => 'pg_wrap_key',
+ provolatile => 'v', prorettype => 'bytea',
+ proargtypes => 'bytea', prosrc => 'pg_wrap_key' },
===> src/backend/crypto/kmgr.c
+Datum
+pg_wrap_key(PG_FUNCTION_ARGS)
+{
+ bytea *data = PG_GETARG_BYTEA_PP(0);
+ bytea *res;
+ int datalen;
+ int reslen;
+ int len;
+
(2)
I think the documentation needs to make clear the difference between a key and a user secret. Some parts of it are trying to mix the 2 terms together when they shouldn't. To my understanding, a key is expressed as binary data that is actually used in the encryption and decryption operations. A user secret, on the other hand, is more like a passphrase, expressed as string, that is used to derive a encryption key for subsequent encryption operations.
If I just look at the function names "pg_wrap_key" and "pg_unwrap_key", I immediately feel that these functions are used to encapsulate and uncover cryptographic key materials. The input and output to these 2 functions should all be key materials expressed in bytea data type. In previous email discussion, there was only one key material in discussion, called the master key (generated during initdb and stored in cluster), and this somehow automatically makes people (including myself) associate pg_wrap_key and pg_unwrap_key to be used on this master key and raise a bunch of security concerns around it.
Having read the documentation provided by the patch describing pg_wrap_key and pg_unwrap_key, they seem to serve another purpose. It states that pg_wrap_key is used to encrypt a user-supplied secret (text) with the master key and produce a wrapped secret while pg_unwrap_key does the opposite, so we can prevent user from having to enter the secret in plaintext when using pgcrypto functions.
This use case is ok but user secret is not really a cryptographic key material is it? It is more similar to a secret passphrase expressed in text and pg_wrap_key is merely used to turn the passphrase into a wrapped passphrase expressed in bytea.
If I give pg_wrap_key with a real key material expressed in bytea, I will not be able to unwrap it properly:
Select pg_unwrap_key (pg_wrap_key(E'\\x2b073713476f9f0e761e45c64be8175424d2742e7d53df90b6416f1d84168e8a') );
pg_unwrap_key
----------------------------------------------
+\x077\x13Go\x0Ev\x1EEK\x17T$t.}SߐAo\x1D\x16
(1 row)
Maybe we should rename these SQL functions like this to prevent confusion.
=> pg_wrap_secret (takes a text, returns a bytea)
=> pg_unwrap_secret(takes a bytea, returns a text)
if there is a use case for users to encapsulate key materials then we can define 2 more wrap functions for these, if there is no use case, don't bother:
=> pg_wrap_key (takes a bytea, returns a bytea)
=> pg_unwrap_key (takes a bytea, returns a bytea)
(3)
I would rephrase "chapter 32: Encryption Key Management Part III. Server Administration" documentation like this:
=====================
PostgreSQL supports Encryption Key Management System, which is enabled when PostgreSQL is built with --with-openssl option and cluster_passphrase_command is specified during initdb process. The user-provided cluster_passphrase_command in postgresql.conf and the cluster_passphrase_command specified during initdb process must match, otherwise, the database cluster will not start up.
The user-provided cluster passphrase is derived into a Key Encryption Key (KEK), which is used to encapsulate the Master Encryption Key generated during the initdb process. The encapsulated Master Encryption Key is stored inside the database cluster.
Encryption Key Management System provides several functions to allow users to use the master encryption key to wrap and unwrap their own encryption secrets during encryption and decryption operations. This feature allows users to encrypt and decrypt data without knowing the actual key.
=====================
(4)
I would rephrase "chapter 32.2: Wrap and Unwrap user secret" documentation like this: Note that I rephrase based on (2) and uses pg_(un)wrap_secret instead.
=====================
Encryption key management System provides several functions described in Table 9.97, to wrap and unwrap user secrets with the Master Encryption Key, which is uniquely and securely stored inside the database cluster.
These functions allow user to encrypt and decrypt user data without having to provide user encryption secret in plain text. One possible use case is to use encryption key management together with pgcrypto. User wraps the user encryption secret with pg_wrap_secret() and passes the wrapped encryption secret to the pgcrypto encryption functions. The wrapped secret can be stored in the application server or somewhere secured and should be obtained promptly for cryptographic operations with pgcrypto.
[same examples follow after...]
=====================
Cary Huang
-------------
HighGo Software Inc. (Canada)
mailto:cary(dot)huang(at)highgo(dot)ca
---- On Tue, 25 Feb 2020 12:50:18 -0800 Cary Huang <mailto:cary(dot)huang(at)highgo(dot)ca> wrote ----
Hi
I would like to share with you a front end patch based on kms_v4.patch that you have shared, called "kms_v4_fe.patch".
The patch integrates front end tool pg_waldump with the KMSv4 and obtain encryption and decryption cipher contexts from the KMS backend. These cipher contexts can then be used in subsequent encryption and decryption operations provided by cipher.h when TDE is enabled. I added two common functions in your kmgr_utils that other front end tools affected by TDE can also use to obtain the cipher contexts. Do let me know if this is how you would envision KMS APIs to be used by a front end.
cheers
Cary Huang
-------------
HighGo Software Inc. (Canada)
mailto:cary(dot)huang(at)highgo(dot)ca
---- On Mon, 24 Feb 2020 17:55:09 -0800 Masahiko Sawada <mailto:masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote ----
On Thu, 20 Feb 2020 at 16:16, Masahiko Sawada
<mailto:masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Wed, 19 Feb 2020 at 09:29, Cary Huang <mailto:cary(dot)huang(at)highgo(dot)ca> wrote:
> >
> > Hi
> >
> > I have tried the attached kms_v3 patch and have some comments:
> >
> > 1) In the comments, I think you meant hmac + iv + encrypted data instead of iv + hmac + encrypted data?
> >
> > ---> in kmgr_wrap_key( ):
> > + /*
> > + * Assemble the wrapped key. The order of the wrapped key is iv, hmac and
> > + * encrypted data.
> > + */
>
> Right, will fix.
>
> >
> >
> > 2) I see that create_keywrap_ctx function in kmgr_utils.c and regular cipher context init will both call ossl_aes256_encrypt_init to initialise context for encryption and key wrapping. In ossl_aes256_encrypt_init, the cipher method always initialises to aes-256-cbc, which is ok for keywrap because under CBC block cipher mode, we only had to supply one unique IV as initial value. But for actual WAL and buffer encryption that will come in later, I think the discussion is to use CTR block cipher mode, which requires one unique IV for each block, and the sequence id from WAL and buffer can be used to derive unique IV for each block for better security? I think it would be better to allow caller to decide which EVP_CIPHER to initialize? EVP_aes_256_cbc(), EVP_aes_256_ctr() or others?
> >
> > +ossl_aes256_encrypt_init(pg_cipher_ctx *ctx, uint8 *key)
> > +{
> > + if (!EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), NULL, NULL, NULL))
> > + return false;
> > + if (!EVP_CIPHER_CTX_set_key_length(ctx, PG_AES256_KEY_LEN))
> > + return false;
> > + if (!EVP_EncryptInit_ex(ctx, NULL, NULL, key, NULL))
> > + return false;
> > +
> > + /*
> > + * Always enable padding. We don't need to check the return
> > + * value as EVP_CIPHER_CTX_set_padding always returns 1.
> > + */
> > + EVP_CIPHER_CTX_set_padding(ctx, 1);
> > +
> > + return true;
> > +}
>
> It seems good. We can expand it to make caller decide the block cipher
> mode of operation and key length. I removed such code from the
> previous patch to make it simple since currently we support only
> AES-256 CBC.
>
> >
> > 3) Following up point 2), I think we should enhance the enum to include not only the Encryption algorithm and key size, but also the block cipher mode as well because having all 3 pieces of information can describe exactly how KMS is performing the encryption and decryption. So when we call "ossl_aes256_encrypt_init", we can include the new enum as input parameter and it will initialise the EVP_CIPHER_CTX with either EVP_aes_256_cbc() or EVP_aes_256_ctr() for different purposes (key wrapping, or WAL encryption..etc).
> >
> > ---> kmgr.h
> > +/* Value of key_management_cipher */
> > +enum
> > +{
> > + KMGR_CIPHER_OFF = 0,
> > + KMGR_CIPHER_AES256
> > +};
> > +
> >
> > so it would become
> > +enum
> > +{
> > + KMGR_CIPHER_OFF = 0,
> > + KMGR_CIPHER_AES256_CBC = 1,
> > + KMGR_CIPHER_AES256_CTR = 2
> > +};
> >
> > If you agree with this change, several other places will need to be changed as well, such as "kmgr_cipher_string", "kmgr_cipher_value" and the initdb code....
>
> KMGR_CIPHER_XXX is relevant with cipher mode used by KMS and KMS would
> still use AES256 CBC even if we had TDE which would use AES256 CTR.
>
> After more thoughts, I think currently we can specify -e aes-256 to
> initdb but actually this is not necessary. When
> --cluster-passphrase-command specified, we enable the internal KMS and
> always use AES256 CBC. Something like -e option will be needed after
> supporting TDE with several cipher options. Thoughts?
>
> >
> > 4) the pg_wrap_key and pg_unwrap_key SQL functions defined in kmgr.c
> > I tried these new SQL functions and found that the pg_unwrap_key will produce the original key with 4 bytes less. This is because the result length is not set long enough to accommodate the 4 byte VARHDRSZ header used by the multi-type variable.
> >
> > the len variable in SET_VARSIZE(res, len) should include also the variable header VARHDRSZ. Now it is 4 byte short so it will produce incomplete output.
> >
> > ---> pg_unwrap_key function in kmgr.c
> > + if (!kmgr_unwrap_key(UnwrapCtx, (uint8 *) VARDATA_ANY(data), datalen,
> > + (uint8 *) VARDATA(res), &len))
> > + ereport(ERROR,
> > + (errmsg("could not unwrap the given secret")));
> > +
> > + /*
> > + * The size of unwrapped key can be smaller than the size estimated
> > + * before unwrapping since the padding is removed during unwrapping.
> > + */
> > + SET_VARSIZE(res, len);
> > + PG_RETURN_BYTEA_P(res);
> >
> > I am only testing their functionalities with random key as input data. It is currently not possible for a user to obtain the wrapped key from the server in order to use these wrap/unwrap functions. I personally don't think it is a good idea to expose these functions to user
>
> Thank you for testing. I'm going to include regression tests and
> documentation in the next version patch.
>
Attached the updated version patch. In this version, I've removed -e
option of initdb that was used to specify the encryption algorithm and
key length like aes-256. The cipher algorithm and key length used by
KMS is fixed, aes-256, so it's no longer necessary as long as we
support only KMS. When we introduce transparent data encryption and
we'd like to support multiple options we will have such option.
Therefore, the internal KMS is enabled when PostgreSQL is built with
--with-openssl and --cluster-passphrase-command is specified to
initdb. The patch includes minimal doc and regression tests.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2020-03-03 00:03:52 | Re: Improve search for missing parent downlinks in amcheck |
Previous Message | Andres Freund | 2020-03-02 23:24:21 | Re: Improving connection scalability: GetSnapshotData() |