From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org |
Subject: | Re: Statistics Import and Export |
Date: | 2025-04-01 02:33:15 |
Message-ID: | Z-tQa5zsVkcCyYin@nathan |
Lists: | pgsql-hackers |
On Mon, Mar 31, 2025 at 11:11:47AM -0400, Corey Huinker wrote:
> In light of v11-0001 being committed as 4694aedf63bf, I've rebased the
> remaining patches.
I spent the day preparing these for commit. A few notes:
* I've added a new prerequisite patch that skips the second WriteToc() call
for custom-format dumps that do not include data. After some testing and
code analysis, I haven't identified any examples where this produces
different output. This doesn't help much on its own, but it will become
rather important when we move the attribute statistics queries to happen
within WriteToc() in 0002.
* I was a little worried about the correctness of 0002 for dumps that run
the attribute statistics queries twice, but I couldn't identify any
problems here either.
* I removed a lot of miscellaneous refactoring that seemed unnecessary for
these patches. Let's move that to another patch set and keep these as
simple as possible.
* I made a small adjustment to the TOC scan restarting logic in
fetchAttributeStats(). Specifically, we now only allow the scan to
restart once for custom-format dumps that include data.
* While these patches help decrease pg_dump's memory footprint, I believe
pg_restore still reads the entire TOC into memory. That's not this patch
set's problem, but I think it's still an important consideration for the
bigger picture.
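For readers following along, here is a toy sketch of one way to read the "restart at most once" rule for the TOC scan. It is not the code from the attached patches: ToyEntry, ToyScan, toy_scan_init() and toy_next_stats_entry() are hypothetical names invented for illustration, and the real fetchAttributeStats() works on pg_dump's own TOC structures. The sketch assumes only what is stated above, namely that the scan may wrap around once when the dump is custom-format and includes data, and never otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a TOC entry; not pg_dump's TocEntry. */
typedef struct
{
	int			id;
	bool		is_stats;		/* entry carries attribute statistics */
} ToyEntry;

/* Hypothetical scan state that persists between calls. */
typedef struct
{
	const ToyEntry *entries;
	size_t		nentries;
	size_t		next;			/* index where the forward scan resumes */
	int			restarts_left;	/* wrap-arounds still permitted */
} ToyScan;

/*
 * Assumption: only custom-format dumps that include data walk the TOC
 * twice, so only those get a single restart.
 */
static void
toy_scan_init(ToyScan *scan, const ToyEntry *entries, size_t nentries,
			  bool custom_format_with_data)
{
	assert(scan != NULL);
	scan->entries = entries;
	scan->nentries = nentries;
	scan->next = 0;
	scan->restarts_left = custom_format_with_data ? 1 : 0;
}

/*
 * Return the id of the next statistics entry, or -1 once the scan is
 * exhausted and no restart remains.
 */
static int
toy_next_stats_entry(ToyScan *scan)
{
	assert(scan != NULL);

	for (;;)
	{
		while (scan->next < scan->nentries)
		{
			const ToyEntry *te = &scan->entries[scan->next++];

			if (te->is_stats)
				return te->id;
		}

		/* Reached the end of the TOC: wrap around only if allowed. */
		if (scan->restarts_left <= 0)
			return -1;
		scan->restarts_left--;
		scan->next = 0;
	}
}
```

In this model, a scan initialized with custom_format_with_data set to true yields each statistics entry twice before reporting exhaustion, and any other scan yields each entry once.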
Regarding whether pg_dump should dump statistics by default, my current
thinking is that it shouldn't, but I think we _should_ have pg_upgrade
dump/restore statistics by default because that is arguably the most
important use-case. This is more a gut feeling than anything, so I reserve
the right to change my opinion.
My goal is to commit the attached patches on Friday morning, but of course
that is subject to change based on any feedback or objections that emerge
in the meantime.
--
nathan
Attachment | Content-Type | Size |
---|---|---|
v12n-0001-Skip-second-WriteToc-for-custom-format-dumps-wi.patch | text/plain | 1.6 KB |
v12n-0002-pg_dump-Reduce-memory-usage-of-dumps-with-stati.patch | text/plain | 8.1 KB |
v12n-0003-pg_dump-Batch-queries-for-retrieving-attribute-.patch | text/plain | 8.0 KB |