More performance improvements for pg_dump in binary upgrade mode

From:	Daniel Gustafsson <daniel(at)yesql(dot)se>
To:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
Subject:	More performance improvements for pg_dump in binary upgrade mode
Date:	2024-05-15 20:15:13
Message-ID:	8F1F1E1D-D17B-4B33-B014-EDBCD15F3F0B@yesql.se
Lists:	pgsql-hackers

Prompted by an off-list bug report of pg_upgrade hanging for large schemas
(which turned out to be slow enough to be perceived as a hang), I had a look at
pg_dump performance in --binary-upgrade mode today. My initial take was to
write more or less exactly what Nathan did in [0], only to realize that a) it
was already proposed and b) I had even reviewed it. Doh.

The next attempt was to eliminate more per-object queries from binary upgrade
mode, and the typarray lookup in binary_upgrade_set_type_oids_by_type_oid
seemed like a good candidate for a cache lookup. Since we already cache type
information in TypeInfo objects, adding typarray to TypeInfo lets us use the
existing lookup code.
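
To illustrate the idea (this is not the attached patch), here is a minimal
sketch of answering the typarray question from an in-memory cache instead of
issuing one query per type. SketchTypeInfo, sketch_index_types and
sketch_lookup_typarray are hypothetical names, and the sorted array with
bsearch() is an assumption made for brevity; pg_dump's real TypeInfo and its
lookup code are more involved.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, cut-down stand-in for pg_dump's TypeInfo. */
typedef struct SketchTypeInfo
{
    Oid oid;        /* pg_type.oid */
    Oid typarray;   /* pg_type.typarray, fetched in the same bulk query */
} SketchTypeInfo;

/* Order cache entries by type OID so that bsearch() can be used. */
static int
sketch_cmp_by_oid(const void *a, const void *b)
{
    Oid x = ((const SketchTypeInfo *) a)->oid;
    Oid y = ((const SketchTypeInfo *) b)->oid;

    return (x > y) - (x < y);
}

/* Sort the cache once, after all types have been loaded. */
void
sketch_index_types(SketchTypeInfo *types, size_t ntypes)
{
    assert(types != NULL);
    qsort(types, ntypes, sizeof(SketchTypeInfo), sketch_cmp_by_oid);
}

/*
 * Return the array type OID for type_oid from the cache, or InvalidOid if
 * the type is not cached. No round trip to the server is needed.
 */
Oid
sketch_lookup_typarray(const SketchTypeInfo *types, size_t ntypes,
                       Oid type_oid)
{
    SketchTypeInfo key = {type_oid, InvalidOid};
    const SketchTypeInfo *hit;

    assert(types != NULL);
    hit = bsearch(&key, types, ntypes, sizeof(SketchTypeInfo),
                  sketch_cmp_by_oid);
    return hit ? hit->typarray : InvalidOid;
}
```

A caller would fill the array from a single query over pg_type, call
sketch_index_types() once, and then call sketch_lookup_typarray() for each
dumped type, so the per-object cost becomes a memory lookup rather than a
query.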

As a baseline, pg_dump dumps a synthetic workload of 10,000 (empty) relations
with a width of 1-10 columns:

$ time ./bin/pg_dump --schema-only --quote-all-identifiers --format=custom \
--file a postgres > /dev/null

real 0m1.256s
user 0m0.273s
sys 0m0.059s

The same dump in binary upgrade mode runs significantly slower:

$ time ./bin/pg_dump --schema-only --quote-all-identifiers --binary-upgrade \
--format=custom --file a postgres > /dev/null

real 1m9.921s
user 0m0.782s
sys 0m0.436s

With the typarray caching from the patch attached here added:

$ time ./bin/pg_dump --schema-only --quote-all-identifiers --binary-upgrade \
--format=custom --file b postgres > /dev/null

real 0m45.210s
user 0m0.655s
sys 0m0.299s

With the typarray caching from the patch attached here added *and* Nathan's
patch from [0] added:

$ time ./bin/pg_dump --schema-only --quote-all-identifiers --binary-upgrade \
--format=custom --file a postgres > /dev/null

real 0m1.566s
user 0m0.309s
sys 0m0.080s

The combination of these patches thus puts binary upgrade mode almost on par
with a plain dump, which has the potential to make upgrades of large schemas
faster. Parallel-parking this patch with Nathan's in the July CF, just wanted
to type it up while it was fresh in my mind.

--
Daniel Gustafsson

[0] https://commitfest.postgresql.org/48/4936/

Attachment	Content-Type	Size
0001-Cache-typarray-for-fast-lookups-in-binary-upgrade-mo.patch	application/octet-stream	3.4 KB

Responses

Re: More performance improvements for pg_dump in binary upgrade mode at 2024-05-15 20:21:36 from Nathan Bossart
