BUG #13792: Weird querry planner behavior

From: ydolgikh(at)jerasoft(dot)net
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #13792: Weird querry planner behavior
Date: 2015-12-03 16:06:44
Message-ID: 20151203160644.5889.21946@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 13792
Logged by: Yurii Dolhikh
Email address: ydolgikh(at)jerasoft(dot)net
PostgreSQL version: 9.3.10
Operating system: CentOS
Description:

Reporting as a bug as suggested by Craig Ringer.

Stackoverflow thread:
http://stackoverflow.com/questions/34064639/postgresql-weird-querry-planner-behavior

Assume I have a query like this:

SELECT *
FROM clients c
INNER JOIN clients_balances cb ON cb.id_clients = c.id
LEFT JOIN clients com ON com.id = c.id_companies
LEFT JOIN clients com_real ON com_real.id = c.id_companies_real
LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table
LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table
LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms
LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND
cpgw.id_companies = c.id_companies_real)
WHERE
EXISTS (SELECT * FROM accounts WHERE (name LIKE 'x' OR accname LIKE 'x' OR
ani LIKE 'x') AND id_clients = c.id)
AND c."type" = '0'
AND c."id" > 0
ORDER BY c."name";
This query takes around 35 seconds to run when used in the production
environment ("clients" has about 1 million records). However, if I take out
ANY join - the query will take only about 300 ms to execute.

I've played around with the query planner settings, but to no avail.

Here are the two explain analyze outputs:

http://explain.depesz.com/s/hzy (slow - 48049.574 ms)
http://explain.depesz.com/s/FWCd (fast - 286.234 ms, rate_tables JOIN
removed)
http://explain.depesz.com/s/MyRf (fast - 539.733 ms, paygw_clients_profiles
JOIN removed)
It looks like in the fast case the planner starts from the EXISTS statement
and has to perform join for only two rows in total. However, in the slow
case it will first join all the tables and then filter by EXISTS.

What I need to do is to make this query run in a reasonable time with all
seven join in place.

Postgres version is 9.3.10 on CentOS 6.3.

Thanks.

UPDATE

Rewriting the query like this:

SELECT *
FROM clients c
INNER JOIN clients_balances cb ON cb.id_clients = c.id
INNER JOIN accounts a ON a.id_clients = c.id AND (a.name = 'x' OR
a.accname = 'x' OR a.ani = 'x')
LEFT JOIN clients com ON com.id = c.id_companies
LEFT JOIN clients com_real ON com_real.id = c.id_companies_real
LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table
LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table
LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms
LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND
cpgw.id_companies = c.id_companies_real)
WHERE
c."type" = '0' AND c.id > 0
ORDER BY c."name";
makes it run fast, however, this is not acceptable, as account filtration
parameters are optional, and I still need the result if there are no matches
in that table. Using "LEFT JOIN accounts" instead of "INNER JOIN accounts"
kills the performance again.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message tomek 2015-12-03 16:55:20 BUG #13793: Please implement IP_FREEBIND option
Previous Message Daniel Verite 2015-12-03 13:08:41 Re: Possible bug: ALTER TABLE x DROP COLUMN y "column ... does not exist" then it does