Re: document the need to analyze partitioned tables

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, yuzuko <yuzukohosoya(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: document the need to analyze partitioned tables
Date: 2022-01-21 18:31:38
Message-ID: 19c248d9-3687-9e0f-0b1a-2adad1de6fbe@enterprisedb.com
Lists: pgsql-hackers

On 1/21/22 19:02, Justin Pryzby wrote:
> Thanks for looking at this
>
> On Fri, Jan 21, 2022 at 06:21:57PM +0100, Tomas Vondra wrote:
>> Hi,
>>
>> On 10/8/21 14:58, Justin Pryzby wrote:
>>> Cleaned up and attached as a .patch.
>>>
>>> The patch implementing autoanalyze on partitioned tables should
>>> revert relevant portions of this patch.
>>
>> I went through this patch and I'd like to propose a couple changes, per the
>> 0002 patch:
>>
>> 1) I've reworded the changes in maintenance.sgml a bit. It sounded a bit
>> strange before, but I'm not a native speaker so maybe it's worse ...
>
> + autoanalyze on the parent table. If your queries require statistics on
> + parent relations for proper planning, it's necessary to periodically run
>
> You added two references to "relations", but everything else talks about
> "tables", which is all that analyze processes.
>

Good point, that should use "tables" too.

>> 2) Remove unnecessary whitespace changes in perform.sgml.
>
> Those were a note to myself and to any reviewer - should that be updated too?
>

Ah, I see. I don't think that part needs updating - it talks about
having to analyze after a bulk load, and that applies to all tables
anyway. I don't think it needs to mention that partitioned tables need
an analyze too.

>> 3) Simplify the analyze.sgml changes a bit - it was trying to cram too much
>> stuff into a single paragraph, so I split that.
>>
>> Does that seem OK, or did I omit something important?
>
> + If the table being analyzed has one or more children,
>
> I think you're referring to both legacy inheritance and partitioning. That
> should be more clear.
>

I think it applies to both types of partitioning - it's just that in the
declarative partitioning case the parent table is always empty, so no
stats with inherit=false are built.

> + <command>ANALYZE</command> gathers two sets of statistics: once on the rows
> + of the parent table only, and a second one including rows of both the parent
> + table and all child relations. This second set of statistics is needed when
>
> I think it should say ".. and all of its children".
>

OK

>> FWIW I think it's really confusing we have inheritance and partitioning, and
>> partitions and child tables. And sometimes we use partitioning in the
>> generic sense (i.e. including the inheritance approach), and sometimes only
>> the declarative variant. Same for partitions vs child tables. I can't even
>> imagine how confusing this has to be for people just learning this stuff.
>> They must be in permanent WTF?! state ...
>
> The docs were cleaned up some in 0c06534bd. At least the word "partitioned"
> should never be used for legacy inheritance - but "partitioning" is.
>

OK

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
v3-0001-documentation-deficiencies-for-ANALYZE-of-partiti.patch text/x-patch 5.9 KB
v3-0002-minor-changes-rewordings.patch text/x-patch 5.7 KB
