vacuumdb --analyze-only (e.g., after a major upgrade) vs. partitioned tables: pg_statistic missing stats for the partitioned table itself

From: Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: vacuumdb --analyze-only (e.g., after a major upgrade) vs. partitioned tables: pg_statistic missing stats for the partitioned table itself
Date: 2024-10-24 21:48:42
Message-ID: CANNMO+K7m=k=T3kVtMQKwr1nC-7QpoEhH3_-DtwnuMqMMHdGVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Nikolay Samokhvalov <nik(at)postgres(dot)ai>
2:47 PM (0 minutes ago)
to pglsql-hackers
I just learned that vacuumdb --analyze-only doesn't update stats for the
partitioned table itself, taking care only about individual partitions:

(DDL doesn't matter here)

# vacuumdb --analyze-only -U postgres test --verbose
...
INFO: analyzing "public.measurement_2023_01"
INFO: "measurement_2023_01": scanned 6370 of 6370 pages, containing
1000000 live rows and 0 dead rows; 30000 rows in sample, 1000000 estimated
total rows
INFO: "measurement_2023_02": scanned 6257 of 6257 pages, containing 982279
live rows and 0 dead rows; 30000 rows in sample, 982279 estimated total rows
INFO: analyzing "public.measurement_2023_03"
INFO: "measurement_2023_03": scanned 6483 of 6483 pages, containing
1017721 live rows and 0 dead rows; 30000 rows in sample, 1017721 estimated
total rows
...

test=# select starelid::regclass, count(*) from pg_statistic where
starelid::regclass::text ~ 'measurement' group by 1 order by 1;
starelid | count
---------------------+-------
measurement_2023_01 | 4
measurement_2023_02 | 4
measurement_2023_03 | 4
(3 rows)

While for the single-threaded SQL-level ANALYZE:

test=# analyze verbose measurement;
...
test=# select starelid::regclass, count(*) from pg_statistic where
starelid::regclass::text ~ 'measurement' group by 1 order by 1;
starelid | count
---------------------+-------
measurement | 4
measurement_2023_01 | 4
measurement_2023_02 | 4
measurement_2023_03 | 4
(4 rows)

This means that if, after running pg_upgrade, we use vacuumdb to update
stats faster, some stats may be missing, potentially leading to suboptimal
performance.

Additionally, it doesn't help that pg_stat_all_tables doesn't show
counters/timestamps for partitioned table, even after SQL-level ANALYZE:

test=# select relname, analyze_count, autoanalyze_count, last_analyze,
last_autoanalyze from pg_stat_user_tables where relname ~ 'measurement';
relname | analyze_count | autoanalyze_count |
last_analyze | last_autoanalyze
---------------------+---------------+-------------------+-------------------------------+------------------
measurement_2023_01 | 2 | 0 | 2024-10-24
21:25:47.979958+00 |
measurement_2023_02 | 2 | 0 | 2024-10-24
21:25:48.070355+00 |
measurement_2023_03 | 2 | 0 | 2024-10-24
21:25:48.154613+00 |
(3 rows)

This is also discussed in
https://www.postgresql.org/message-id/flat/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA%40mail.gmail.com

I propose considering 3 fixes:

1) vacuumdb --analyze / --analyze-only to update stats for the partitioned
table, so people using pg_upgrade are not in trouble
2) present the ANALYZE metadata for partitioned tables in pg_stat_all_tables
3) for old versions, either backpatch with fix (1) OR just add to the docs
(and maybe to the final words pg_upgrade prints), suggesting something like
this in addition to vacuumdb analyze-only:

-- psql snippet
select format(
'analyze verbose %I.%I;',
relnamespace::oid::regnamespace,
oid::regclass
) as vacuum_command
from pg_class
where relkind = 'p' \gexec

Additionally, I do like the idea of ANALYZE ONLY from the -general
discussion above (though, there might be confusion with logic of --analyze
and --analyze-only in vacuumdb).

Does it make sense?

Nik

Browse pgsql-hackers by date

  From Date Subject
Next Message Rahila Syed 2024-10-24 21:55:10 Re: Using read_stream in index vacuum
Previous Message Greg Sabino Mullane 2024-10-24 19:51:04 Re: Trigger more frequent autovacuums of heavy insert tables