Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Jeff Davis <pgsql(at)j-davis(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org
Subject: Re: Statistics Import and Export
Date: 2025-02-21 22:09:42
Message-ID: CADkLM=d9DapVDPXB+c_FFBbwvXCOXrjLNqP7RUhBpsT1Kr=Hfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> Oy. Those are outright horrid, even without any consideration of
> pre-preparing them. We know the OID of the table we want to dump,
> we should be doing "FROM pg_class WHERE oid = whatever" and lose
> the join to pg_namespace altogether. The explicit casts to regclass
> are quite expensive too to fetch information that pg_dump already
> has. It already knows the server version, too.
>

+1
Earlier versions had prepared statements, but those were removed to keep
things simple. Easy enough to revive.

>
> Moreover, the first of these shouldn't be a separate query at all.
> I objected to fetching pg_statistic content for all tables at once,
> but relpages/reltuples/relallvisible is a pretty small amount of
> new info. We should just collect those fields as part of getTables'
> main query of pg_class (which, indeed, is already fetching relpages).
>

+1

> On the second one, if we want to go through the pg_stats view then
> we can't rely on table OID, but I don't see why we need the joins
> to anything else. "WHERE s.schemaname = 'x' AND s.tablename = 'y'"
> seems sufficient.
>

+1

>
> I wonder whether we ought to issue different queries depending on
> whether we're superuser. The pg_stats view is rather expensive
> because of its security restrictions, and if we're superuser we
> could just look directly at pg_statistic. Maybe those checks are
> fast enough not to matter, but ...

That could lead to a rather complicated query that has to replicate the
guts of pg_stats for every server-specific version of pg_stats,
specifically the CASE statements that transform
the stakindN/stanumbersN/stavaluesN to mcv, correlation, etc, so I'd like
to avoid that if possible.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-02-21 22:20:48 Re: Statistics Import and Export
Previous Message Melanie Plageman 2025-02-21 22:00:49 Re: BitmapHeapScan streaming read user and prelim refactoring