From: | Jamie Koceniak <jkoceniak(at)mediamath(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #14399: Order by id DESC causing bad query plan |
Date: | 2016-11-02 15:31:12 |
Message-ID: | E1ECB3A1-ECD9-4694-AE32-DD13C9CC9E26@mediamath.com |
Lists: | pgsql-bugs |
Hi David,
Thanks for the suggestion on rewriting the query.
Unfortunately, it yields the same performance problem.
https://explain.depesz.com/s/fab
Query rewritten (sorry, I did leave out the function call in the original email):
select * FROM
orders t1
JOIN customer t2 ON (t1.customer_id = t2.id)
WHERE
EXISTS (select 1 from valid_customers(15348) t3 where t3.customer_id = t2.id)
ORDER BY t1.id DESC
LIMIT 10 ;
We also do pagination, and if you add a limit 10 offset 90, for example, the performance is about 10 times worse.
If you actually sort by a non-indexed field, then the query runs in 37ms.
Here is the query plan using a non-indexed field:
https://explain.depesz.com/s/BtF2
So the query sorted by a non-indexed field looks like:
select * FROM
orders t1
JOIN customer t2 ON (t1.customer_id = t2.id)
WHERE
EXISTS (select 1 from valid_customers(15348) t3 where t3.customer_id = t2.id)
ORDER BY t1.created_on desc
LIMIT 10 ;
Thanks,
Jamie
From: David Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
Date: Tuesday, November 1, 2016 at 4:41 PM
To: Jamie Koceniak <jkoceniak(at)mediamath(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: [BUGS] BUG #14399: Order by id DESC causing bad query plan
On Thu, Oct 27, 2016 at 5:16 PM, <jkoceniak(at)mediamath(dot)com> wrote:
The following bug has been logged on the website:
Bug reference: 14399
Logged by: Jamie Koceniak
Email address: jkoceniak(at)mediamath(dot)com
PostgreSQL version: 9.4.6
Operating system: Linux
Description:
One table has 2M records (orders) joining to another table with 75K records
(customers).
Query:
select * FROM
orders t1
JOIN customer t2 ON (t1.customer_id = t2.id)
WHERE
t2.id IN (select distinct customer_id from valid_customers)
ORDER BY t1.id
LIMIT 10 ;
Bug potential aside, the better way to write that is to use a proper semi-join (i.e., EXISTS):
SELECT *
FROM orders t1
JOIN customer t2 ON (t1.customer_id = t2.id)
WHERE EXISTS (SELECT 1 FROM valid_customers t3 WHERE t3.customer_id = t2.id)
ORDER BY t1.id
LIMIT 10;
Note too that your query plan has a "function scan" node, unlike what your query implies...
Sorry I can't be of more help with the information you've provided.
David J.