From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org |
Subject: | Re: Statistics Import and Export |
Date: | 2025-01-20 08:21:35 |
Message-ID: | CACJufxGuRPfd70NwogkRE26cZcKGrR9gjG4OiH18tu=BnyezHA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jan 16, 2025 at 4:56 AM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>>
>>
>>
>> I do like the idea of a "Statistics for ..." prefix, and I think it's
>> doable.
>
>
> And that's now implemented.
>
>
>> The caller needs some knowledge about that anyway, to correctly output
>> the statistics dump when the schema is not requested. Tests should
>> cover those cases, too.
>
>
> Tests for pg_dump --no-statistics and pg_dump --schema-only were added. Rebased to master as of today.
>
> I'm not completely happy with this patch, as I had to comment out one check in pg_backup_archiver that seemed necessary, but perhaps another set of eyes will set me straight.
>
> Attached is just the pg_dump stuff, and only for relation/attribute stats. The extended stats and vacuumdb work will be in their own threads going forward.
hi
The current v38 implementation allows statistics to be placed in
either SECTION_DATA or SECTION_POST_DATA.
IMHO, moving all statistics to the SECTION_POST_DATA section would
simplify things.
Attached is a patch that implements this change, (based on your patch,
obviously)
Reasoning making statistics in SECTION_POST_DATA are:
* statistics in multi sections will make the --section
handle statistics more harder for pg_dump and pg_restore.
* current doc in --schema-only says
"It is similar to, but for historical reasons not identical to, specifying
--section=pre-data --section=post-data.".
section span two sections making this sentence not less accurate.
Also, if we want to use --section option to dump all the statistics, we
need to use --data and --post-data together to get all the statistics.
* generally, --post-data section takes less time then --data section,
so putting it in SECTION_POST_DATA won't "cost" us that much.
* repairDependencyLoop, repairMatViewBoundaryMultiLoop is quite hard
to comprehend.
make statistics in SECTION_POST_DATA can make use not to think about
these changes.
also seems there is no materialized view statistics dump and restore
tests in src/bin/pg_dump/t/002_pg_dump.pl
* There are many REQ_DATA or REQ_SCHEMA occurrences, for each
occurrence, we may need to consider REQ_STATS.
make the statistics in SECTION_POST_DATA, then we don't need REQ_STATS.
what do you think?
This issue [1] seems not yet resolved.
[1] https://postgr.es/m/Z22kX5x2IhNb8kHE@momjian.us
even if pg_upgrade has options --with-statistics and --with-statistics.
we still need a sentence to mention which option is default?
Attachment | Content-Type | Size |
---|---|---|
v38-0001-make-statistics-dumped-at-SECTION_POST_DATA.no-cfbot | application/octet-stream | 51.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2025-01-20 08:26:11 | Re: Re: proposal: schema variables |
Previous Message | Nisha Moond | 2025-01-20 08:03:54 | Re: Introduce XID age and inactive timeout based replication slot invalidation |