From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | Zhihong Yu <zyu(at)yugabyte(dot)com>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: extended stats on partitioned tables |
Date: | 2021-12-12 21:32:10 |
Message-ID: | 20211212213210.GP17618@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Dec 12, 2021 at 05:17:10AM +0100, Tomas Vondra wrote:
> The one thing bugging me a bit is that the regression test checks only a
> GROUP BY query. It'd be nice to add queries testing MCV/dependencies
> too, but that seems tricky because most queries will use per-partitions
> stats.
You mean because the quals are pushed down to the scan node.
Does that indicate a deficiency ?
If extended stats are collected for a parent table, selectivity estimates based
from the parent would be better; but instead we use uncorrected column
estimates from the child tables.
From what I see, we could come up with a way to avoid the pushdown, involving
volatile functions/foreign tables/RLS/window functions/SRF/wholerow vars/etc.
But would it be better if extended stats objects on partitioned tables were to
collect stats for both parent AND CHILD ? I'm not sure. Maybe that's the
wrong solution, but maybe we should still document that extended stats on
(empty) parent tables are often themselves not used/useful for selectivity
estimates, and the user should instead (or in addition) create stats on child
tables.
Or, maybe if there's no extended stats on the child tables, stats on the parent
table should be consulted ?
--
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2021-12-12 21:49:15 | Re: extended stats on partitioned tables |
Previous Message | Thomas Munro | 2021-12-12 21:31:29 | Re: Windows now has fdatasync() |