From: | bricklen <bricklen(at)gmail(dot)com> |
---|---|
To: | Erik van Zijst <erik(dot)van(dot)zijst(at)gmail(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org list" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Suboptimal query plan when using expensive BCRYPT functions |
Date: | 2014-03-22 20:51:38 |
Message-ID: | CAGrpgQ_yi-kFq+02Nz2cORhK5o+X8phm3R_sPu7GwtUXUx02fQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Fri, Mar 21, 2014 at 5:59 PM, Erik van Zijst <erik(dot)van(dot)zijst(at)gmail(dot)com>wrote:
> Hi there,
>
> I've got a relatively simple query that contains expensive BCRYPT
> functions that gets optimized in a way that causes postgres to compute
> more bcrypt hashes than necessary, thereby dramatically slowing things
> down.
>
> In a certain part of our application we need to lookup users by their
> username, email address and password. Now we don't store plaintext
> passwords and so the query needs to compute bcrypt hashes on the fly:
>
> SELECT DISTINCT u.*
> FROM auth_user u
> JOIN bb_userprofile p ON p.user_id = u.id
> JOIN bb_identity i ON i.profile_id = p.id
> WHERE
> (
> (
> u.username ILIKE 'detkin'
> OR
> i.email ILIKE 'foo(at)example(dot)com'
> )
> AND
> (
> SUBSTRING(password FROM 8) = CRYPT(
> 'detkin', SUBSTRING(password FROM 8))
> )
> )
>
> These queries are generated by a parser that translates from an
> external query language to SQL run on the database. This test db
> contains 12 user records.
>
> With a single bcrypt hash taking ~300ms to compute, this is a recipe
> for disaster and so the app only allows queries that require only a
> very small number of bcrypt computation.
>
> E.g. the user must always "AND" the password lookup with a clause like
> " username = 'foo' AND password = 'bar'" (username is unique).
>
> However, while the query above technically only needs to compute 1
> hash (there is a user 'detkin' and email 'foo(at)example(dot)com' does not
> exist), it instead creates a query plan that computes hashes *before*
> filtering on username and email, leading to 12 hash computations and a
> very slow query.
>
> The EXPLAIN (ANALYZE, BUFFERS) is here: http://explain.depesz.com/s/yhE
>
> The schemas for the 3 tables involved are here:
> http://pgsql.privatepaste.com/f72020ad0a
>
> As a quick experiment I tried moving the joins and email lookup into a
> nested IN query, but that still generates a plan that computes hashes
> for all 12 users, before picking out the 1 whose username matches.
>
> Is there any way I can get postgres to perform the hash calculations
> on the *result* of the other parts of the where clause, instead of the
> other way around? Or else rewrite the query?
>
> Cheers,
> Erik
>
(untested), but how about something like the following:
WITH au AS (
SELECT DISTINCT u.*
FROM auth_user u
JOIN bb_userprofile p ON p.user_id = u.id
JOIN bb_identity i ON i.profile_id = p.id
WHERE u.username ILIKE 'detkin'
OR i.email ILIKE 'foo(at)example(dot)com')
SELECT au.*
FROM au
WHERE SUBSTRING(au.password FROM 8) = CRYPT('detkin', SUBSTRING(au.password
FROM 8));
From | Date | Subject | |
---|---|---|---|
Next Message | Erik van Zijst | 2014-03-22 22:27:41 | Re: Suboptimal query plan when using expensive BCRYPT functions |
Previous Message | Brett Wooldridge | 2014-03-22 11:26:34 | Re: Connection pooling - Number of connections |