From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org |
Subject: | Re: Statistics Import and Export |
Date: | 2025-02-21 21:24:38 |
Message-ID: | 3670503.1740173078@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> Looking at the query log, the biggest culprit is a *lot* of additional
> queries, I think primarily these two:
> SELECT c.oid::regclass AS relation, current_setting('server_version_num') AS version, c.relpages, c.reltuples, c.relallvisible FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace WHERE n.nspname = 'public' AND c.relname = 'alpha_neg_p2'
> SELECT c.oid::regclass AS relation, s.attname,s.inherited,current_setting('server_version_num') AS version, s.null_frac,s.avg_width,s.n_distinct,s.most_common_vals,s.most_common_freqs,s.histogram_bounds,s.correlation,s.most_common_elems,s.most_common_elem_freqs,s.elem_count_histogram,s.range_length_histogram,s.range_empty_frac,s.range_bounds_histogram FROM pg_stats s JOIN pg_namespace n ON n.nspname = s.schemaname JOIN pg_class c ON c.relname = s.tablename AND c.relnamespace = n.oid WHERE s.schemaname = 'public' AND s.tablename = 'alpha_neg_p2' ORDER BY s.attname, s.inherited
Oy. Those are outright horrid, even without any consideration of
pre-preparing them. We know the OID of the table we want to dump,
we should be doing "FROM pg_class WHERE oid = whatever" and lose
the join to pg_namespace altogether. The explicit casts to regclass
are quite expensive too to fetch information that pg_dump already
has. It already knows the server version, too.
Moreover, the first of these shouldn't be a separate query at all.
I objected to fetching pg_statistic content for all tables at once,
but relpages/reltuples/relallvisible is a pretty small amount of
new info. We should just collect those fields as part of getTables'
main query of pg_class (which, indeed, is already fetching relpages).
On the second one, if we want to go through the pg_stats view then
we can't rely on table OID, but I don't see why we need the joins
to anything else. "WHERE s.schemaname = 'x' AND s.tablename = 'y'"
seems sufficient.
I wonder whether we ought to issue different queries depending on
whether we're superuser. The pg_stats view is rather expensive
because of its security restrictions, and if we're superuser we
could just look directly at pg_statistic. Maybe those checks are
fast enough not to matter, but ...
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2025-02-21 21:44:28 | Re: explain analyze rows=%.0f |
Previous Message | Robert Haas | 2025-02-21 21:20:08 | Re: explain analyze rows=%.0f |