From: | Qingqing Zhou <zhouqq(dot)postgres(at)gmail(dot)com> |
---|---|
To: | Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Our trial to TPC-DS but optimizer made unreasonable plan |
Date: | 2015-08-26 21:38:34 |
Message-ID: | CAJjS0u0XWkBE5MfLqwWkKZZ20AZP+5jjsi+EuUOCmJGLDqX0oA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Aug 19, 2015 at 10:32 AM, Qingqing Zhou
<zhouqq(dot)postgres(at)gmail(dot)com> wrote:
> On Tue, Aug 18, 2015 at 5:59 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>> BTW, did you register the patch on the upcoming commit-fest?
>>
> Not yet, it is in WIP status.
>
While I am working on the patch, I found some issues and resort help
here. Patch attached.
Here is an example:
postgres=# explain WITH q AS (
WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1 on p.i>=p1.i)
SELECT * FROM q WHERE i <= 5;
QUERY PLAN
----------------------------------------------------------------------------------
Nested Loop (cost=0.58..5980.16 rows=133333 width=8)
-> Index Scan using ai on a (cost=0.29..8.36 rows=4 width=8)
Index Cond: (i <= 5)
-> Index Only Scan using ai on a a_1 (cost=0.29..1159.62
rows=33333 width=4)
Index Cond: (i <= a.i)
(5 rows)
So far so good. But if we add other references of the CTE q (m1->m,
m->q), we still have some extra CTE scans:
postgres=# explain WITH q AS (
WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1 on
p.i>=p1.i), m as (select * from q), m1 as (select * from m)
SELECT * FROM m1 WHERE i <= 5;
QUERY PLAN
-----------------------------------------------------------------------------------------
CTE Scan on m (cost=158365985.66..233365985.65 rows=1111111111 width=8)
Filter: (i <= 5)
CTE q
-> Nested Loop (cost=0.29..91699319.00 rows=3333333333 width=8)
-> Seq Scan on a (cost=0.00..1443.00 rows=100000 width=8)
-> Index Only Scan using ai on a a_1 (cost=0.29..583.65
rows=33333 width=4)
Index Cond: (i <= a.i)
CTE m
-> CTE Scan on q (cost=0.00..66666666.66 rows=3333333333 width=8)
(9 rows)
Above two queries essentially the same, but the second one is a
non-optimal plan. The reason is that how my patch works: it put a
substitution in front of SS_process_ctes():
/*
* If there is a WITH list, process each WITH query and build an initplan
! * SubPlan structure for it. Before we process ctes, try to subsitute with
! * subqueries to benefits from global optimization.
*/
if (parse->cteList)
+ {
+ substitute_ctes_with_subqueries(root);
SS_process_ctes(root);
+ }
AFAICS, the substitution only handles cteList within a query block, so
it does not go across the subquery boundary. I can see this is an
issue but can't see a nice way to fix it. Anybody has some recipe?
Regards,
Qingqing
Attachment | Content-Type | Size |
---|---|---|
ctes.patch | application/octet-stream | 8.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2015-08-26 22:23:42 | Re: Custom Scans and private data |
Previous Message | Fabrízio de Royes Mello | 2015-08-26 19:44:57 | Re: Is this a bug? |