Re: pg_upgrade failing for 200+ million Large Objects

From: Robins Tharakan <tharakan(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Date: 2021-03-08 14:13:02
Message-ID: CAEP4nAw2WA1wyb9LG7BOEuN3Xr-xWiZZ0w_hKtpyvdUPKmcAJA@mail.gmail.com
Lists: pgsql-hackers

Hi Magnus,

On Mon, 8 Mar 2021 at 23:34, Magnus Hagander <magnus(at)hagander(dot)net> wrote:

> AFAICT at a quick check, pg_dump in binary upgrade mode emits one
> lo_create() and one ALTER ... OWNER TO for each large object - so with
> 500M large objects that would be a billion statements, and thus a
> billion xids. And without checking, I'm fairly sure it doesn't load in
> a single transaction...
>

Your assumptions are pretty much correct.

The issue isn't with pg_upgrade itself. During pg_restore, each Large
Object consumes one XID for its lo_create() and another for its ALTER
LARGE OBJECT ... OWNER TO. For background, that's why the v9.5 production
instance I was reviewing could not process more than 73 million large
objects: each object required a CREATE + ALTER. (To clarify: 73 million =
(2^31 - the 2 billion magic constant - the 1 million wraparound-protection
margin) / 2.)
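
For illustration, the restore stream contains a pair of statements like
these for every object (a sketch; the OID and the owner role are made up):

  SELECT pg_catalog.lo_create('16385');        -- one XID
  ALTER LARGE OBJECT 16385 OWNER TO app_owner; -- and another

Outside of --single-transaction, each statement runs in its own implicit
transaction, hence the two XIDs per object.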

> Without looking, I would guess it's the schema reload using
> pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
> issue in pg_dump/pg_restore. And if that is the case -- perhaps just
> running all of those in a single transaction would be a better choice?
> One could argue it's still not a proper fix, because we'd still have a
> huge memory usage etc, but it would then only burn 1 xid instead of
> 500M...
>
(I hope I am not missing something, but) when I tried to force pg_restore
to use a single transaction (by hacking pg_upgrade's pg_restore call to
pass --single-transaction), it too failed because it could not lock that
many objects within a single transaction.
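
(For the curious: the shared lock table holds roughly
max_locks_per_transaction * (max_connections + max_prepared_transactions)
entries, and each large object touched in the transaction needs a slot.
A quick way to see the ceiling on a given install:

  SELECT current_setting('max_locks_per_transaction')::int
       * (current_setting('max_connections')::int
        + current_setting('max_prepared_transactions')::int) AS lock_slots;
  -- with default settings: 64 * (100 + 0) = 6400 slots, nowhere near 200M+

so raising the setting far enough for hundreds of millions of objects
isn't realistic.)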

> This still seems to just fix the symptoms and not the actual problem.
>

I agree that the patch doesn't address the root cause, but it did get the
upgrade to complete on a test setup. Do you think that batching multiple
Large Objects into a single transaction (instead of all objects in one),
with the batch size settable by the caller on the command line, would be
a good / acceptable idea here?
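
Roughly, the restore would then run batches like this instead of one
implicit transaction per statement (a sketch; OIDs, owner, and the batch
boundary are made up):

  BEGIN;                                        -- one XID for the whole batch
  SELECT pg_catalog.lo_create('16385');
  ALTER LARGE OBJECT 16385 OWNER TO app_owner;
  SELECT pg_catalog.lo_create('16386');
  ALTER LARGE OBJECT 16386 OWNER TO app_owner;
  -- ... up to the caller-chosen batch size ...
  COMMIT;
  -- the next BEGIN/COMMIT starts a fresh batch, keeping both the XID
  -- burn and the per-transaction lock count bounded
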

> Please take a look at your email configuration -- all your emails are
> lacking both References and In-reply-to headers.
>

Thanks for highlighting the cause here. Hopefully switching mail clients
will help.
-
Robins Tharakan
