pgsql: Speed up rechecking if relation needs to be vacuumed or analyze

From: Fujii Masao <fujii(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Speed up rechecking if relation needs to be vacuumed or analyze
Date: 2020-12-08 15:00:44
Message-ID: E1kmeTs-0000Iw-5f@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Speed up rechecking if relation needs to be vacuumed or analyze in autovacuum.

After autovacuum collects the relations to vacuum or analyze, it rechecks
whether each relation still needs to be vacuumed or analyzed before actually
doing that. Previously this recheck could be a significant overhead
especially when there were a very large number of relations. This was
because each recheck forced the statistics to be refreshed, and the refresh
of the statistics for a very large number of relations could cause heavy
overhead. There was the report that this issue caused autovacuum workers
to have gotten “stuck” in a tight loop of table_recheck_autovac() that
rechecks whether a relation needs to be vacuumed or analyzed.

This commit speeds up the recheck by making autovacuum worker reuse
the previously-read statistics for the recheck if possible. Then if that
"stale" statistics says that a relation still needs to be vacuumed or analyzed,
autovacuum refreshes the statistics and does the recheck again.

The benchmark shows that the more relations exist and autovacuum workers
are running concurrently, the more this change reduces the autovacuum
execution time. For example, when there are 20,000 tables and 10 autovacuum
workers are running, the benchmark showed that the change improved
the performance of autovacuum more than three times. On the other hand,
even when there are only 1000 tables and only a single autovacuum worker
is running, the benchmark didn't show any big performance regression by
the change.

Firstly POC patch was proposed by Jim Nasby. As the result of discussion,
we used Tatsuhito Kasahara's version of the patch using the approach
suggested by Tom Lane.

Reported-by: Jim Nasby
Author: Tatsuhito Kasahara
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/3FC6C2F2-8A47-44C0-B997-28830B5716D0@amazon.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/e2ac3fed3b1c3281281eb530c220634030cd8084

Modified Files
--------------
src/backend/postmaster/autovacuum.c | 108 +++++++++++++++++++++++++++++-------
1 file changed, 89 insertions(+), 19 deletions(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2020-12-08 17:06:38 pgsql: Doc: explain that the string types can't store \0 (ASCII NUL).
Previous Message Fujii Masao 2020-12-08 12:06:23 pgsql: Bump catversion for pg_stat_wal changes.