From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Subject: | Re: Statistics Import and Export |
Date: | 2024-03-06 11:06:28 |
Message-ID: | CAEze2WidTCpSkPaL_R=ojmmAyYq_RHSG+dS27upbfF__5KGKZg@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, 6 Mar 2024 at 11:33, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> On Wed, Mar 6, 2024 at 11:07 Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
>> Or even just one VALUES for the whole statistics loading?
>
>
> I don’t think we’d want to go beyond one relation at a time as then it can be parallelized, we won’t be trying to lock a whole bunch of objects at once, and any failures would only impact that one relation’s stats load.
That also makes sense.
>> I suspect the main issue with combining this into one statement
>> (transaction) is that failure to load one column's statistics implies
>> you'll have to redo all the other statistics (or fail to load the
>> statistics at all), which may be problematic at the scale of thousands
>> of relations with tens of columns each.
>
>
> I’m pretty skeptical that “stats fail to load and lead to a failed transaction” is a likely scenario that we have to spend a lot of effort on.
Agreed on the "don't have to spend a lot of time on it", but I'm not
so sure about the "unlikely" part while the autovacuum daemon is
involved, specifically for non-upgrade pg_restore. I imagine (haven't
checked) that autoanalyze is disabled during pg_upgrade, but
pg_restore doesn't do that, while it would have to be able to restore
statistics of a table if it is included in the dump (and the version
matches).
> What are the cases where we would be seeing stats reloads failing where it would make sense to re-try on a subset of columns, or just generally, if we know that the pg_dump version matches the target server version?
Last time I checked, pg_restore's default is to load data on a
row-by-row basis without --single-transaction or --exit-on-error. Of
course, pg_upgrade uses its own set of flags, but if a user is
restoring stats with pg_restore, I suspect they'd rather have some
columns' stats loaded than no stats at all; so I would assume this
requires one separate pg_import_pg_statistic()-transaction for every
column.
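To illustrate the trade-off, here is a rough sketch only. It uses SQLite from Python as a stand-in for the server, and the table `col_stats`, the helper `load_stats`, and the sample rows are all made up for illustration; none of this is the proposed import function or pg_restore's actual behaviour. It just shows how one bad column's stats affect the rest under each transaction strategy:

```python
import sqlite3

# Hypothetical per-column stats; the second row violates the CHECK below.
ROWS = [("a", 10.0), ("b", -5.0), ("c", 3.0)]


def fresh_db():
    """Create an in-memory database with a stand-in stats table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE col_stats ("
        " name TEXT PRIMARY KEY,"
        " n_distinct REAL NOT NULL CHECK (n_distinct >= -1))"
    )
    return conn


def load_stats(conn, rows, single_transaction):
    """Load rows and return how many ended up committed."""
    insert = "INSERT INTO col_stats VALUES (?, ?)"
    if single_transaction:
        # All columns in one transaction: any failure discards everything.
        try:
            for row in rows:
                conn.execute(insert, row)
            conn.commit()
        except sqlite3.Error:
            conn.rollback()
    else:
        # One transaction per column: a failure only loses that column.
        for row in rows:
            try:
                conn.execute(insert, row)
                conn.commit()
            except sqlite3.Error:
                conn.rollback()
    return conn.execute("SELECT count(*) FROM col_stats").fetchone()[0]


if __name__ == "__main__":
    print("single transaction:", load_stats(fresh_db(), ROWS, True))
    print("per-column transactions:", load_stats(fresh_db(), ROWS, False))
```

With the single-transaction variant the failing column takes the other two down with it; with per-column transactions only the failing column is lost, at the cost of more round trips and commits.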
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)