From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, yuzuko <yuzukohosoya(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Amit Langote <amitlangote09(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu> |
Subject: | Re: Autovacuum on partitioned table (autoanalyze) |
Date: | 2021-08-17 10:49:50 |
Message-ID: | 20210817104950.bkvv2cvttylpn4j3@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2021-08-16 17:42:48 -0400, Álvaro Herrera wrote:
> On 2021-Aug-16, Álvaro Herrera wrote:
>
> > Here's the reversal patch for the 14 branch. (It applies cleanly to
> > master, but the unused member of PgStat_StatTabEntry needs to be
> > removed and catversion bumped).
>
> I have pushed this to both branches. (I did not remove the item from
> the release notes in the 14 branch.)
>
> It upsets me to have reverted it, but after spending so much time trying
> to correct the problems, I believe it just wasn't salvageable within the
> beta-period code freeze constraints.
:(
> I described the issues I ran into
> in earlier messages; I think a good starting point to re-develop this is
> to revert the reversal commit, then apply my patch at
> https://postgr.es/m/0794d7ca-5183-486b-9c5e-6d434867cecd@www.fastmail.com
> then do something about the remaining problems that were complained
> about. (Maybe: add an "ancestor OID" member to PgStat_StatTabEntry so
> that the collector knows to propagate counts from children to ancestors
> when the upd/ins/del counts are received.
My suspicion is that it'd be a lot easier to implement this efficiently if
there were no propagation done outside of actually analyzing tables. I.e. have
do_autovacuum() build a hashtable of (parent_table_id, count) and use that to
make the analyze decisions. And then only propagate up the costs to parents of
tables when a child is analyzed (and thus looses its changes_since_analyze)
value. Then we can use hashtable_value + changes_since_analyze for
partitioning decisions of partitioned tables.
I've prototyped this, and it does seem to make do_autovacuum() cheaper. I've
attached that prototype, but note it's in a rough state.
However, unless we change the way inheritance parents are stored, it still
requires repetitive get_partition_ancestors() (or get_partition_parent())
calls in do_autovacuum(), which I think is problematic due to the index scans
you pointed out as well. The obvious way to address that would be to store
parent oids in pg_class - I suspect duplicating parents in pg_class is the
best way out, but pretty it is not.
> However, consider developing it as follow-up to Horiguchi-san's shmem
> pgstat rather than current pgstat implementation.)
+1
It might be worth to first tackle reusing samples from a relation's children
when building inheritance stats. Either by storing the samples somewhere (not
cheap) and reusing them, or by at least updating a partition's stats when
analyzing the parent.
Greetings,
Andres Freund
Attachment | Content-Type | Size |
---|---|---|
autovac-partitioned-via-hash.diff | text/x-diff | 19.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2021-08-17 10:50:22 | Re: Autovacuum on partitioned table (autoanalyze) |
Previous Message | Amit Kapila | 2021-08-17 10:08:44 | Re: Added schema level support for publication. |