Re: pg_upgrade failing for 200+ million Large Objects

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, vignesh C <vignesh21(at)gmail(dot)com>, "Kumar, Sachin" <ssetiya(at)amazon(dot)com>, Robins Tharakan <tharakan(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Date: 2024-07-26 22:55:00
Message-ID: CAPpHfdtpr2FYVYqmARZdSp=2zsndgJgmSsHFxFFNTiYRfX8uaA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 27, 2024 at 1:37 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> > On Fri, Jul 26, 2024 at 10:53:30PM +0300, Alexander Korotkov wrote:
> >> It would be nice to identify such cases and check which memory contexts are
> >> growing and why.
>
> > I reproduced the problem with this schema:
>
> > SELECT format('CREATE TABLE p(i int, %s) PARTITION BY RANGE(i)', array_to_string(a, ', ')) FROM (SELECT array_agg(format('i%s int', i))a FROM generate_series(1,999)i);
> > SELECT format('CREATE TABLE t%s PARTITION OF p FOR VALUES FROM (%s)TO(%s)', i,i,i+1) FROM generate_series(1,999)i;
>
> > This used over 4 GB of RAM.
>
> Interesting. This doesn't bloat particularly much in a regular
> pg_restore, even with --transaction-size=1000; but it does in
> pg_upgrade, as you say. I found that the bloat was occurring
> during these long sequences of UPDATE commands issued by pg_upgrade:
>
> -- For binary upgrade, recreate inherited column.
> UPDATE pg_catalog.pg_attribute
> SET attislocal = false
> WHERE attname = 'i'
> AND attrelid = '"public"."t139"'::pg_catalog.regclass;
>
> -- For binary upgrade, recreate inherited column.
> UPDATE pg_catalog.pg_attribute
> SET attislocal = false
> WHERE attname = 'i1'
> AND attrelid = '"public"."t139"'::pg_catalog.regclass;
>
> -- For binary upgrade, recreate inherited column.
> UPDATE pg_catalog.pg_attribute
> SET attislocal = false
> WHERE attname = 'i2'
> AND attrelid = '"public"."t139"'::pg_catalog.regclass;
>
> I think the problem is basically that each one of these commands
> causes a relcache inval, for which we can't reclaim space right
> away, so that we end up consuming O(N^2) cache space for an
> N-column inherited table.

I was about to report the same.
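
In case it helps with similar reports, the per-backend context logging
added in v14 makes it easy to see where the space goes while the restore
is running. Something along these lines (illustrative only; the
application_name filter is a guess, adjust it to match what the restoring
session reports in pg_stat_activity):

-- log the memory contexts of the restoring backend into the server log
SELECT pg_log_backend_memory_contexts(pid)
FROM pg_stat_activity
WHERE application_name LIKE 'pg_restore%';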

> It's fairly easy to fix things so that this example doesn't cause
> that to happen: we just need to issue these updates as one command
> not N commands per table. See attached. However, I fear this should
> just be considered a draft, because the other code for binary upgrade
> in the immediate vicinity is just as aggressively stupid and
> unoptimized as this bit, and can probably also be driven to O(N^2)
> behavior with enough CHECK constraints etc. We've gone out of our way
> to make ALTER TABLE capable of handling many updates to a table's DDL
> in one command, but whoever wrote this code appears not to have read
> that memo, or at least to have believed that performance of pg_upgrade
> isn't of concern.
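
For illustration, consolidating the per-column updates quoted above into
one command per table would presumably look something like this (a sketch
only; the attached draft may differ in details):

-- one UPDATE per table instead of one UPDATE per inherited column
UPDATE pg_catalog.pg_attribute
SET attislocal = false
WHERE attrelid = '"public"."t139"'::pg_catalog.regclass
  AND attname IN ('i', 'i1', 'i2');  -- one entry per inherited column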

I was thinking about counting the actual number of queries, rather
than TOC entries, toward the transaction size as a more universal
solution. But that would require using psql_scan() or writing a
simpler alternative for this particular purpose, which looks quite
annoying. What do you think?

------
Regards,
Alexander Korotkov
Supabase
