From: | Bear Giles <bgiles(at)coyotesong(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Danylo Hlynskyi <abcz2(dot)uprola(at)gmail(dot)com> |
Subject: | Re: [FEATURE REQUEST] Encrypted indexes over encrypted data |
Date: | 2018-08-09 18:27:55 |
Message-ID: | CALBNtw4c=MRPXfCPeEnFnb44_5pBOBGYFpp7THj0QLKyaSBc9A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
There are alternatives. If you know what you want to find, e.g., a search
by username or email address, you can store a strong hash of the value as
an indexed column. By "strong hash" I mean don't just use md5 or sha1, or
even one round with a salt. I can give you more details about how and why
offline.
So you might have a record with:
id serial primary key,
email_hash text not null indexed,
first_name_hash text indexed,
last_name_hash text indexed,
phone_number_hash text indexed ,
'wallet'_containing_all_encrypted_values text
and that allows you to search on email, first name, last name, or phone
number, or some combination on them. But no expressions. The hashing would
be done in your app, not the database. You also probably want to convert
everything to lowercase, maybe remove spaces, etc., before computing the
hash.
You should be prepared to handle multiple matches. It's unlikely that an
email or phone number hash won't be unique but it's safest to always be
prepared for more than one match, decrypt the 'wallet', and then do a final
comparison. That also gives you a bit of protection from an attacker
creating an account and then changing the hash values to match someone
else. You can use that to support very limited expressions, e.g., also keep
a hash on the first three letters of their last name, but that will
compromise your security a bit since it allows an attacker to perform some
statistical analysis on the data.
Finally there's the general advice that hashes (and encrypted values)
should always have a version number of some sort. It could be something as
simple as 3$hash, or it could be a composite column or even a user-defined
type. The # indicates is a lookup into a table, perhaps in your app, that
tells you which hashing algorithm and salt to use. It makes life a lot
easier if the security audit tells you that you need to change your
cipher/salt/key/whatever but you can't do it immediately since you don't
know everything you need in order to do it, e.g., the password that you
need in order to recompute the hash value. With that version number it's
easy to continue to accept the existing password so they can log in, and in
the background you quietly recompute the hash using the new
salt/algorithm/whatever and update their record. I've worked for some
pretty knowledgeable companies that have overlooked this.
On Thu, Aug 9, 2018 at 6:05 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
>
> On August 9, 2018 5:30:26 PM GMT+05:30, Danylo Hlynskyi <
> abcz2(dot)uprola(at)gmail(dot)com> wrote:
> > ?Is it hard to implement soluition 2?
>
> Yes.
>
> To the point that I'm fairly certain that an implementation would be
> considered to costly to maintain (vs benefit) of proposed.
>
> Andres
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2018-08-09 18:32:58 | Re: libpq should not look up all host addresses at once |
Previous Message | Andrew Dunstan | 2018-08-09 18:26:28 | Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11) |