| From: | "Kumar, Sachin" <ssetiya(at)amazon(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Jacob Champion <champion(dot)p(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Robins Tharakan <tharakan(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: pg_upgrade failing for 200+ million Large Objects | 
| Date: | 2023-12-04 16:07:59 | 
| Message-ID: | 83D44BE5-0088-4D41-8AE6-20A05D026F46@amazon.com | 
| Lists: | pgsql-hackers | 
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
> FWIW, I agree with Jacob's concern about it being a bad idea to let
> users of pg_upgrade pass down arbitrary options to pg_dump/pg_restore.
> I think we'd regret going there, because it'd hugely expand the set
> of cases pg_upgrade has to deal with.
> Also, pg_upgrade is often invoked indirectly via scripts, so I do
> not especially buy the idea that we're going to get useful control
> input from some human somewhere. I think we'd be better off to
> assume that pg_upgrade is on its own to manage the process, so that
> if we need to switch strategies based on object count or whatever,
> we should put in a heuristic to choose the strategy automatically.
> It might not be perfect, but that will give better results for the
> pretty large fraction of users who are not going to mess with
> weird little switches.
I have updated the patch to use a heuristic: during pg_upgrade we count the large objects in each database. During pg_restore execution, if a database's large-object count is greater than LARGE_OBJECTS_THRESOLD (1000), we use --restore-blob-batch-size.

I also modified the pg_upgrade --jobs behavior for databases whose large-object count exceeds LARGE_OBJECTS_THRESOLD:
```diff
+  /* Restore all the dbs where LARGE_OBJECTS_THRESOLD is not breached */
+  restore_dbs(stats, true);
+  /* reap all children */
+  while (reap_child(true) == true)
+     ;
+  /* Restore rest of the dbs one by one with pg_restore --jobs = user_opts.jobs */
+  restore_dbs(stats, false);
   /* reap all children */
   while (reap_child(true) == true)
      ;
```
Regards
Sachin
| Attachment | Content-Type | Size | 
|---|---|---|
| pg_upgrade_improvements_v7.diff | application/octet-stream | 27.9 KB | 