Re: Statistics Import and Export

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, alvherre(at)alvh(dot)no-ip(dot)org
Subject:	Re: Statistics Import and Export
Date:	2024-08-23 20:49:52
Message-ID:	11bb2c7403d4a7421fe8b09df6737aa83976266e.camel@j-davis.com
Lists:	pgsql-hackers

On Thu, 2024-08-15 at 20:53 -0400, Corey Huinker wrote:
>
> > * Remind me why the new stats completely replace the new row,
> > rather
> > than updating only the statistic kinds that are specified?
>
> because:
> - complexity

I don't think it significantly impacts the overall complexity. We have
a ShareUpdateExclusiveLock on the relation, so there's no concurrency
to deal with, and an upsert operation is not many more lines of code.

> - we would then need a mechanism to then tell it to *delete* a
> stakind

That sounds useful regardless. I have introduced pg_clear_*_stats()
functions.

> - we'd have to figure out how to reorder the remaining stakinds, or
> spend effort finding a matching stakind in the existing row to know
> to replace it

Right. I initialized the values/nulls arrays based on the existing
tuple, if any, and created a set_stats_slot() function that searches
for either a matching stakind or the first empty slot.
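A minimal sketch of that slot search, written as standalone C rather than taken from the patch. The StatsRow struct, the double payload, and the function names are hypothetical simplifications; a real pg_statistic tuple is manipulated through values/nulls arrays. The only facts assumed from PostgreSQL are that a row has five slots (STATISTIC_NUM_SLOTS) and that a stakind of 0 marks an empty slot.

```c
#include <assert.h>

#define NUM_SLOTS 5             /* mirrors STATISTIC_NUM_SLOTS */

/* Hypothetical stand-in for the stakindN columns of one pg_statistic row. */
typedef struct StatsRow
{
    int     stakind[NUM_SLOTS]; /* 0 means the slot is empty */
    double  value[NUM_SLOTS];   /* placeholder for the slot's payload */
} StatsRow;

/*
 * Return the index of the slot already holding 'kind'; failing that, the
 * first empty slot; failing that, -1 (the row is full).
 */
static int
find_slot(const StatsRow *row, int kind)
{
    int     first_empty = -1;

    for (int i = 0; i < NUM_SLOTS; i++)
    {
        if (row->stakind[i] == kind)
            return i;
        if (row->stakind[i] == 0 && first_empty < 0)
            first_empty = i;
    }
    return first_empty;
}

/*
 * Store one statistic kind, leaving every other slot untouched.
 * Returns the slot used, or -1 if no slot was available.
 */
int
set_slot(StatsRow *row, int kind, double value)
{
    assert(kind != 0);          /* 0 is reserved for "empty" */

    int     slot = find_slot(row, kind);

    if (slot < 0)
        return -1;
    row->stakind[slot] = kind;
    row->value[slot] = value;
    return slot;
}
```

Because the search prefers a matching stakind over an empty slot, setting a kind that is already present overwrites it in place and no reordering of the remaining slots is needed.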

> - "do what analyze does" was an initial goal and as a result many
> test cases directly compared pg_statistic rows from an original table
> to an empty clone table to see if the "copy" had fidelity.

Can't we just clear the stats first to achieve the same effect?

I have attached version 28j as one giant patch covering what was
previously 0001-0003. It's a bit rough (tests in particular need some
work), but it implements the logic to replace only those values
specified rather than the whole tuple.

At least for the interactive "set" variants of the functions, I think
it's an improvement. It feels more natural to just change one stat
without wiping out all the others. I realize a lot of the statistics
depend on each other, but the point is not to replace ANALYZE, the
point is to experiment with planner scenarios. What do others think?

For the "restore" variants, I'm not sure it matters a lot because the
stats will already be empty. If it does matter, we could pretty easily
define the "restore" variants to wipe out existing stats when loading
the table, though I'm not sure if that's a good thing or not.

I also made more use of FunctionCallInfo structures to communicate
between functions rather than huge parameter lists. I believe that
reduced the line count substantially, and made it easier to transform
the argument pairs in the "restore" variants into the positional
arguments for the "set" variants.
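As a rough illustration of the pairs-to-positional step, here is a standalone C sketch that does not use FunctionCallInfo or any PostgreSQL headers. The ArgPair type, the function name, and the three parameter names are assumptions chosen for illustration and are not the patch's actual argument list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_ARGS 3

/* Hypothetical positional parameter names of a "set" variant. */
static const char *const arg_names[NUM_ARGS] = {
    "relpages", "reltuples", "relallvisible"
};

/* One name/value pair as a "restore" variant might receive it. */
typedef struct ArgPair
{
    const char *name;
    const char *value;
} ArgPair;

/*
 * Place each recognized pair's value at the position of its name.
 * Positions that are never named stay NULL; unknown names are skipped.
 * Returns the number of pairs that matched a known name.
 */
int
pairs_to_positional(const ArgPair *pairs, size_t npairs,
                    const char *positional[NUM_ARGS])
{
    int     matched = 0;

    assert(pairs != NULL || npairs == 0);

    for (int i = 0; i < NUM_ARGS; i++)
        positional[i] = NULL;

    for (size_t p = 0; p < npairs; p++)
    {
        for (int i = 0; i < NUM_ARGS; i++)
        {
            if (strcmp(pairs[p].name, arg_names[i]) == 0)
            {
                positional[i] = pairs[p].value;
                matched++;
                break;
            }
        }
    }
    return matched;
}
```

In this sketch a position left NULL stands for "not specified", which fits the replace-only-what-was-given behaviour described earlier in this message.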

Regards,
Jeff Davis

Attachment	Content-Type	Size
v28j-0001-Create-functions-pg_set_relation_stats-pg_clear.patch	text/x-patch	113.7 KB

In response to

Re: Statistics Import and Export at 2024-08-16 00:53:37 from Corey Huinker

Responses

Re: Statistics Import and Export at 2024-08-27 03:35:00 from jian he
Re: Statistics Import and Export at 2024-09-05 17:29:44 from Corey Huinker
