Re: Poor query performance on one of two "like" databases in production.

From: <Keaton_Adams(at)McAfee(dot)com>
To: <Keaton_Adams(at)McAfee(dot)com>, <schmiddy(at)gmail(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Poor query performance on one of two "like" databases in production.
Date: 2010-05-14 21:04:02
Message-ID: C81316E2.1670E%keaton_adams@mcafee.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

It looks like it is just a difference in data volume. We are re-working the query to see what that will do.

Thanks for the suggestions.

-K

On 5/14/10 2:23 PM, "Adams, Keaton" <Keaton_Adams(at)McAfee(dot)com> wrote:

OK,

So if I change the NOT IN clause the query will run with a MERGE JOIN and complete in about 20 seconds. Have a look at the logic I am following and see if it makes sense. Might this just be a case where because there is more data in one DB compared to another (even though the counts are "close"), that's the difference in this whole thing?

-K

postgres@> more badquery.sql
EXPLAIN SELECT substring(users.email from '^.*\@') || domains.domain as email,
users.customer_id,
users.password,
p.policy_set_id,
users.user_id,
domains.auth_type
FROM
(
SELECT d.customer_id, d.domain, d.domain_id, d.auth_type, d.active from mxl_domain d
UNION ALL
SELECT d.customer_id, da.domain, da.domain_id, d.auth_type, da.active from mxl_domain d, mxl_domain_alias da WHERE da.domain_id = d.domain_id
) as domains,
(
SELECT u.email, u.customer_id, u.user_id, u.domain_id, u.password, u.active from mxl_user u
UNION ALL
SELECT ua.email, u.customer_id, u.user_id, u.domain_id, u.password, ua.active from mxl_user u, mxl_user_alias ua WHERE ua.user_id = u.user_id
) as users,
wds_policy_set p
WHERE users.customer_id = p.id
AND users.customer_id = domains.customer_id
AND users.domain_id = domains.domain_id
AND p.default_flag = 1
AND p.web_access_flag = 1
AND p.scope = 3
AND domains.active = 1
AND users.active != 0
AND p.active = 1
AND users.user_id NOT IN (SELECT user_id FROM mxl_user_group mug WHERE mug.user_id = users.user_id);

postgres@ time psql -Upostgres -dmxl -fbadquery.sql
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=3522288.96..3534654.13 rows=245730 width=1564)
Merge Cond: ((p.id = u.customer_id) AND (d.domain_id = u.domain_id))
-> Sort (cost=38160.35..38471.30 rows=124381 width=536)
Sort Key: p.id, d.domain_id
-> Hash Join (cost=578.15..10203.83 rows=124381 width=536)
Hash Cond: (d.customer_id = p.id)
-> Append (cost=0.00..4989.45 rows=77541 width=528)
-> Seq Scan on mxl_domain d (cost=0.00..1810.88 rows=64390 width=28)
Filter: (active = 1)
-> Subquery Scan "*SELECT* 2" (cost=2454.78..3178.57 rows=13151 width=528)
-> Hash Join (cost=2454.78..3047.06 rows=13151 width=33)
Hash Cond: (da.domain_id = d.domain_id)
-> Seq Scan on mxl_domain_alias da (cost=0.00..296.39 rows=13151 width=25)
Filter: (active = 1)
-> Hash (cost=1649.90..1649.90 rows=64390 width=12)
-> Seq Scan on mxl_domain d (cost=0.00..1649.90 rows=64390 width=12)
-> Hash (cost=525.74..525.74 rows=4193 width=8)
-> Seq Scan on wds_policy_set p (cost=0.00..525.74 rows=4193 width=8)
Filter: ((default_flag = 1) AND (web_access_flag = 1) AND (active = 1) AND (scope = 3))
-> Materialize (cost=3484128.61..3497039.21 rows=1032848 width=1044)
-> Sort (cost=3484128.61..3486710.73 rows=1032848 width=1044)
Sort Key: u.customer_id, u.domain_id
-> Append (cost=0.00..2826808.61 rows=1032848 width=1044)
-> Index Scan using mxl_user_domain_id_idx on mxl_user u (cost=0.00..2416377.66 rows=906921 width=52)
Filter: ((active <> 0) AND (NOT (subplan)))
SubPlan
-> Index Scan using mxl_user_group_uid_idx on mxl_user_group mug (cost=0.00..2.47 rows=1 width=4)
Index Cond: (user_id = $0)
-> Subquery Scan "*SELECT* 2" (cost=8.23..410430.95 rows=125927 width=1044)
Filter: (NOT (subplan))
-> Merge Join (cost=8.23..95536.63 rows=251854 width=55)
Merge Cond: (u.user_id = ua.user_id)
-> Index Scan using mxl_user_pkey on mxl_user u (cost=0.00..77679.47 rows=1881318 width=28)
-> Index Scan using mxl_user_alias_uid_idx on mxl_user_alias ua (cost=0.00..10109.21 rows=251854 width=31)
Filter: (ua.active <> 0)
SubPlan
-> Index Scan using mxl_user_group_uid_idx on mxl_user_group mug (cost=0.00..2.47 rows=1 width=4)
Index Cond: (user_id = $0)
(38 rows)

postgres@ time psql -Upostgres -dmxl -fbadquery.sql -okda.out

real 0m22.645s
user 0m1.565s
sys 0m0.246s
postgres@> wc -l kda.out
285563 kda.out

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Grittner 2010-05-14 21:09:21 Re: List traffic
Previous Message Richard Broersma 2010-05-14 20:46:36 ALTER DOMAIN feature request