From: | yuzuko <yuzukohosoya(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | David Steele <david(at)pgmasters(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Amit Langote <amitlangote09(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu> |
Subject: | Re: Autovacuum on partitioned table (autoanalyze) |
Date: | 2021-04-01 11:34:53 |
Message-ID: | CAKkQ508EzAG9ib8rRSNozuYhB1e6p_4FzwrG=YRTW2fENT2+BA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Tomas,
Thank you for reviewing the patch.
> Firstly, the patch propagates the changes_since_analyze values from
> do_analyze_rel, i.e. from the worker after it analyzes the relation.
> That may easily lead to cases with unnecessary analyzes - consider a
> partitioned with 4 child relations:
> [ explanation ]
>
I didn't realize that till now. Indeed, this approach increments parent's
changes_since_analyze counter according to its leaf partition's counter
when the leaf partition is analyzed, so it will cause unnecessary ANALYZE
on partitioned tables as you described.
> I propose a different approach - instead of propagating the counts in
> do_analyze_rel for individual leaf tables, let's do that in bulk in
> relation_needs_vacanalyze. Before the (existing) first pass over
> pg_class, we can add a new one, propagating counts from leaf tables to
> parents.
>
Thank you for your suggestion. I think it could solve all the issues
you mentioned. I modified the patch based on this approach:
- Create a new counter, PgStat_Counter changes_since_analyze_reported,
to track changes_since_analyze we already propagated to ancestors.
This is used for internal processing and users may not need to know it.
So this counter is not displayed at pg_stat_all_tables view for now.
- Create a new function, pgstat_propagate_changes() which propagates
changes_since_analyze counter to all ancestors and saves
changes_since_analyze_reported. This function is called in
do_autovacuum() before relation_needs_vacanalyze().
> Note: I do have some ideas about how to improve that, I've started a
> separate thread about it [1].
>
I'm also interested in merging children's statistics for partitioned tables
because it will make ANALYZE on inheritance trees more efficient.
So I'll check it later.
> I forgot to mention one additional thing yesterday - I wonder if we need
> to do something similar after a partition is attached/detached. That can
> also change the parent's statistics significantly, so maybe we should
> handle all partition's rows as changes_since_analyze? Not necessarily
> something this patch has to handle, but might be related.
>
Regarding attached/detached partitions, I think we should update statistics
of partitioned tables according to the new inheritance tree. The latest patch
hasn't handled this case yet, but I'll give it a try soon.
Attach the v13 patch to this email. Could you please check it again?
--
Best regards,
Yuzuko Hosoya
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
v13_autovacuum_on_partitioned_table.patch | application/octet-stream | 21.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2021-04-01 12:02:09 | Re: Failed assertion on standby while shutdown |
Previous Message | houzj.fnst@fujitsu.com | 2021-04-01 10:41:30 | RE: Hybrid Hash/Nested Loop joins and caching results from subplans |