| From: | "Kumar, Sachin" <ssetiya(at)amazon(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Jacob Champion <champion(dot)p(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Robins Tharakan <tharakan(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: pg_upgrade failing for 200+ million Large Objects | 
| Date: | 2023-12-04 16:07:59 | 
| Message-ID: | 83D44BE5-0088-4D41-8AE6-20A05D026F46@amazon.com | 
| Lists: | pgsql-hackers | 
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
> FWIW, I agree with Jacob's concern about it being a bad idea to let
> users of pg_upgrade pass down arbitrary options to pg_dump/pg_restore.
> I think we'd regret going there, because it'd hugely expand the set
> of cases pg_upgrade has to deal with.
> Also, pg_upgrade is often invoked indirectly via scripts, so I do
> not especially buy the idea that we're going to get useful control
> input from some human somewhere. I think we'd be better off to
> assume that pg_upgrade is on its own to manage the process, so that
> if we need to switch strategies based on object count or whatever,
> we should put in a heuristic to choose the strategy automatically.
> It might not be perfect, but that will give better results for the
> pretty large fraction of users who are not going to mess with
> weird little switches.
I have updated the patch to use a heuristic: during pg_upgrade we count the large objects in each database. During pg_restore execution, if a database's large-object count is greater than LARGE_OBJECTS_THRESOLD (1000), we use --restore-blob-batch-size.

I also modified the pg_upgrade --jobs behavior for databases whose large-object count exceeds LARGE_OBJECTS_THRESOLD:
```diff
+  /* Restore all the dbs where LARGE_OBJECTS_THRESOLD is not breached */
+  restore_dbs(stats, true);
+  /* reap all children */
+  while (reap_child(true) == true)
+     ;
+  /* Restore rest of the dbs one by one with pg_restore --jobs = user_opts.jobs */
+  restore_dbs(stats, false);
   /* reap all children */
   while (reap_child(true) == true)
      ;
```
Regards
Sachin
| Attachment | Content-Type | Size | 
|---|---|---|
| pg_upgrade_improvements_v7.diff | application/octet-stream | 27.9 KB | 