Re: pgsql: autovacuum: handle analyze for partitioned tables

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Yuzuko <yuzukohosoya(at)gmail(dot)com>
Subject: Re: pgsql: autovacuum: handle analyze for partitioned tables
Date: 2021-04-10 21:22:53
Message-ID: f8b09ecf-a590-c22b-b948-3d7dba7d30c3@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 4/10/21 12:29 AM, Justin Pryzby wrote:
> On Fri, Apr 09, 2021 at 06:16:59PM -0400, Alvaro Herrera wrote:
>> On 2021-Apr-09, Justin Pryzby wrote:
>>
>>> One data point: we do DETACH/ATTACH tables during normal operation, before
>>> type-promoting ALTERs, to avoid worst-case disk use, and to avoid locking the
>>> table for a long time. It'd be undesirable (but maybe of no great consequence)
>>> to trigger an ALTER when we DETACH them, since we'll re-ATTACH it shortly
>>> afterwards.
>>
>> You mean to trigger an ANALYZE, not to trigger an ALTER, right?
>
> Oops, right. It's slightly undesirable for a DETACH to cause an ANALYZE.
>
>> I think I agree with Tomas: we should do it by default, and offer some
>> way to turn that off. I suppose a new reloptions, solely for
>> partitioned tables, would be the way to do it.
>>
>>> However, I think DROP should be handled ?
>>
>> DROP of a partition? ... I would think it should do the same as DETACH,
>> right? Inform that however many rows the partition had, are now changed
>> in ancestors.
>
> Yes, drop of an (attached) partition. The case for DROP is clear, since it
> was clearly meant to go away forever. The case for DETACH seems somewhat less
> clear.
>
> The current behavior of pg_dump/restore (since 33a53130a) is to CREATE+ATTACH,
> so there's an argument that if DROPping the partition counts towards the
> parent's analyze, then so should CREATE+ATTACH.
>

I think it's tricky to "optimize" the behavior after ATTACH/DETACH. I'd
argue that in principle, we should aim to keep accurate statistics, so
ATTACH should be treated as insert of all rows, and DETACH should be
treated as delete of all rows. Se for the purpose of ANALYZE, we should
propagate reltuples as changes_since_analyze after ATTACH/DETACH.

Yes, it may result in more frequent ANALYZE on the parent, but I think
that's necessary. Repeated attach/detach of the same partition may bloat
the value, but I guess that's an example of "If it hurts don't do it."

What I think we might do is offer some light-weight analyze variant,
e.g. based on the merging of statistics (I've posted a PoC patch a
couple days ago.). That would make the ANALYZEs on parent much cheaper,
so those "unnecessary" analyzes would not be an issue.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-committers by date

  From Date Subject
Next Message Justin Pryzby 2021-04-11 01:55:37 Re: pgsql: autovacuum: handle analyze for partitioned tables
Previous Message Noah Misch 2021-04-10 19:03:42 pgsql: Standardize pg_authid oid_symbol values.

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-04-10 21:57:43 Re: Reference Leak with type
Previous Message Zhihong Yu 2021-04-10 21:12:32 Re: Reference Leak with type