From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, yuzuko <yuzukohosoya(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: document the need to analyze partitioned tables |
Date: | 2023-01-12 23:27:47 |
Message-ID: | 20230112232747.GA2111950@nathanxps13 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 05, 2022 at 10:37:01AM +0200, Laurenz Albe wrote:
> On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:
>> I've pushed the last version, and backpatched it to 10 (not sure I'd
>> call it a bugfix, but I certainly agree with Justin it's worth
>> mentioning in the docs, even on older branches).
>
> I'd like to suggest an improvement to this. The current wording could
> be read to mean that dead tuples won't get cleaned up in partitioned tables.
Well, dead tuples won't get cleaned up in partitioned tables, as
partitioned tables do not have storage. But I see what you mean. Readers
might misinterpret this to mean that autovacuum will not process the
partitions. There's a good definition of what the docs mean by
"partitioned table" [0], but FWIW it took me some time before I
consistently read "partitioned table" to mean "only the thing with relkind
set to 'p'" and not "both the partitioned table and its partitions." So,
while the current wording it technically correct, I think it'd be
reasonable to expand it to help avoid confusion.
Here is my take on the wording:
Since all the data for a partitioned table is stored in its partitions,
autovacuum does not process partitioned tables. Instead, autovacuum
processes the individual partitions that are regular tables. This
means that autovacuum only gathers statistics for the regular tables
that serve as partitions and not for the partitioned tables. Since
queries may rely on a partitioned table's statistics, you should
collect statistics via the ANALYZE command when it is first populated,
and again whenever the distribution of data in its partitions changes
significantly.
[0] https://www.postgresql.org/docs/devel/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2023-01-13 00:33:14 | Re: Beautify pg_walinspect docs a bit |
Previous Message | Jacob Champion | 2023-01-12 23:19:19 | Re: [EXTERNAL] Re: Support load balancing in libpq |