From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu> |
Cc: | "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Manuel Rigger <rigger(at)nus(dot)edu(dot)sg> |
Subject: | Re: Question about double table scans for a table |
Date: | 2023-07-27 20:38:35 |
Message-ID: | CAApHDvpBJprgHT0R6LkENsH78gAhegbf0Zv5XokXpF8LBYNE9A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Thu, 27 Jul 2023 at 20:49, Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu> wrote:
> Both query plans include different numbers of table scans, as highlighted in red color. PostgreSQL uses six table scans, while TiDB has only three. I understand that the table scanning operation is expensive and query plans are typically more efficient with fewer table scans. My question is why PostgreSQL uses six table scans to scan each table twice? Is it a more efficient query plan, or does this indicate an optimization that is not performed by PostgreSQL?
The PostgreSQL planner does not do any scan deduplication like this.
You could likely write a query containing a WITH MATERALIZE that runs
the query with the GROUP BY, then reference the CTE in both the main
query and also the HAVING clause. e.g, something like:
explain with cte as materialized (select ps_partkey, sum(ps_supplycost
* ps_availqty) as cost from partsupp ... other joins...) select * from
cte having cost > (select sum(cost) from cte);
How much more efficient that'll be will depend on the number of distinct parts.
David
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2023-07-27 21:05:31 | BUG #18040: PostgreSQL does not report its version correctly |
Previous Message | Jeff Davis | 2023-07-27 18:21:25 | search_path not recomputed when role name changes |