From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannuk(at)google(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Horribly slow pg_upgrade performance with many Large Objects
Date: 2025-04-11 20:11:53
Message-ID: Z_l3iRo1FLNnAdRO@nathan
Lists: pgsql-hackers
On Tue, Apr 08, 2025 at 12:22:00PM -0500, Nathan Bossart wrote:
> On Tue, Apr 08, 2025 at 01:07:09PM -0400, Tom Lane wrote:
>> Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
>>> I do think it's worth considering going back to copying
>>> pg_largeobject_metadata's files for upgrades from v16 and newer.
>>
>> (If we do this) I don't see why we'd need to stop at v16. I'm
>> envisioning that we'd use COPY, which will be dealing in the
>> text representation of aclitems, and I don't think that's changed
>> in a long time. The sort of thing that would break it is changes
>> in the set of available/default privilege bits for large objects.
>
> I was thinking of actually reverting commit 12a53c7 for upgrades from v16,
> which AFAICT is the last release where any relevant storage formats changed
> (aclitem changed in v16). But if COPY gets us pretty close to that and is
> less likely to be disrupted by future changes, it could be a better
> long-term approach.
>
>> That is, where the dump currently contains something like
>>
>> SELECT pg_catalog.lo_create('2121');
>> ALTER LARGE OBJECT 2121 OWNER TO postgres;
>> GRANT ALL ON LARGE OBJECT 2121 TO joe;
>>
>> we'd have
>>
>> COPY pg_largeobject_metadata FROM STDIN;
>> ...
>> 2121 10 {postgres=rw/postgres,joe=rw/postgres}
>> ...
>>
>> and some appropriate COPY data for pg_shdepend too.
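For concreteness, the pg_shdepend side of that example might look roughly like
the following; the values are illustrative only, with 5 standing in for the
database OID and 16384 for joe's role OID.  Here classid 2613 is
pg_largeobject, refclassid 1260 is pg_authid, and deptype 'a' records the ACL
reference; an owner row with deptype 'o' would also appear for owners other
than the bootstrap superuser.

COPY pg_shdepend FROM STDIN;
...
5	2613	2121	0	1260	16384	a
...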
I did some more research here. For a dump with many large objects that have
no ACLs, I noticed that the vast majority of the restore time goes to the
ALTER OWNER commands. For 1 million such large objects, restoring took ~73
seconds on my machine. If I instead invented an lo_create_with_owner()
function and created 100 large objects per SELECT command (rough sketch
below), the same restore took ~7 seconds. Copying the relevant pg_shdepend
rows out and back in took ~2.5 seconds. I imagine using COPY for
pg_largeobject_metadata would also take a couple of seconds in this case.
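To make the batching concrete, here's a rough sketch of the sort of thing I
mean (illustrative only, not the prototype I actually tested; the signature
and the SQL-level implementation are just for demonstration, and a real
version would presumably be a built-in C function):

CREATE FUNCTION pg_catalog.lo_create_with_owner(loid oid, owner text)
RETURNS oid LANGUAGE plpgsql AS $$
BEGIN
    -- create the large object with the requested OID, then set its owner
    PERFORM pg_catalog.lo_create(loid);
    EXECUTE pg_catalog.format('ALTER LARGE OBJECT %s OWNER TO %I', loid, owner);
    RETURN loid;
END;
$$;

The dump would then emit batches of calls instead of one ALTER OWNER per
object, e.g.:

SELECT pg_catalog.lo_create_with_owner('2121', 'postgres'),
       pg_catalog.lo_create_with_owner('2122', 'postgres'),
       ...;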
For upgrading, I don't think there's any huge benefit to optimizing the
restore commands versus using COPY. It might make future catalog changes
for large object stuff easier, but I'd expect those to be rare. However,
the optimized restore commands could be nice for non-pg_upgrade use-cases.
--
nathan