Quick Links

Re: Statistics Import and Export

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc:	Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org
Subject:	Re: Statistics Import and Export
Date:	2025-02-26 04:36:09
Message-ID:	1704472.1740544569@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Corey Huinker <corey(dot)huinker(at)gmail(dot)com> writes:
> My solution so far is to take allo the v11+ (SELECT array_agg...) functions
> and put them into a LATERAL, two of them filtered by attstattarget > 0 and
> a new one aggregating attnames with no filter.

> An alternative would be a new subselect for array_agg(attname) WHERE
> in.indexprs IS NOT NULL, thus removing the extra compute for the indexes
> that lack an index expression (i.e. most of them), and thus lack settable
> stats (at least for now) and wouldn't be affected by the name-jitter issue
> anyway.

Yeah, I've been thinking about that. I think that the idea of the
current design is that relatively few indexes will have explicit stats
targets set on them, so most of the time the sub-SELECTs produce no
data. (Which is not to say that they're cheap to execute.) If we
pull all the column names for all indexes then we'll likely bloat
pg_dump's working storage quite a bit. Pulling them only for indexes
with expression columns should fix that, and as you say we don't need
the names otherwise.

I still fear that those sub-selects are pretty expensive in aggregate
-- they are basically forcing a nestloop join -- and maybe we need to
rethink that whole idea.

BTW, just as a point of order: it is not the case that non-expression
indexes are free of name-jitter problems. That's because we don't
bother to rename index columns when the underlying table column is
renamed, thus:

After dump-n-reload, this index's column will be named "xx".
That's not relevant to our current problem as long as we
don't store stats on such index columns, but it's plenty
relevant to the ALTER INDEX ... SET STATISTICS code.

> I'm on the fence about how to handle pg_clear_attribute_stats(), leaning
> toward overloaded functions.

I kinda felt that we didn't need to bother with an attnum-based
variant of pg_clear_attribute_stats(), since pg_dump has no
use for that. I won't stand in the way if you're desperate to
do it, though.

regards, tom lane

In response to

Re: Statistics Import and Export at 2025-02-26 04:05:57 from Corey Huinker

Responses

Re: Statistics Import and Export at 2025-02-26 04:57:40 from Corey Huinker

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Fujii Masao	2025-02-26 04:46:19	Re: Log connection establishment timings
Previous Message	Sagar Shedge	2025-02-26 04:09:13	Re: Extend postgres_fdw_get_connections to return remote backend pid