Re: pg_upgrade failing for 200+ million Large Objects

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, vignesh C <vignesh21(at)gmail(dot)com>, "Kumar, Sachin" <ssetiya(at)amazon(dot)com>, Robins Tharakan <tharakan(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Date: 2024-07-28 21:24:29
Message-ID: 2422717.1722201869@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
>> J4F, I have an idea to count the number of ';' signs and use that
>> for the transaction size counter, since it is an upper-bound
>> estimate of the number of SQL commands :-)

> Hmm ... that's not a completely silly idea. Let's keep it in
> the back pocket in case we can't easily reduce the number of
> SQL commands in some cases.
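
(Note that counting ';' can only overestimate, because semicolons can
also appear inside quoted text. A trivial illustration --- this is
one command, but it contains two semicolons:

-----
CREATE FUNCTION one() RETURNS int LANGUAGE sql
AS $$ SELECT 1; $$;
-----

So a transaction budget based on the semicolon count errs toward
committing sooner than strictly necessary, which is the safe
direction as far as backend bloat is concerned.)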

After poking at this for a while, I found we can fix Justin's example
case by avoiding repeated UPDATEs on pg_attribute, so I think we
should do that. It seems like a clear win, with no downside other
than a small increment of complexity in pg_dump.
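
To be concrete --- schematically, with made-up names, not the exact
output --- where binary-upgrade mode previously emitted one
pg_attribute tweak per inherited column, along the lines of

-----
UPDATE pg_catalog.pg_attribute SET attislocal = false
  WHERE attrelid = 'some_child_table'::pg_catalog.regclass
    AND attname = 'a0';
UPDATE pg_catalog.pg_attribute SET attislocal = false
  WHERE attrelid = 'some_child_table'::pg_catalog.regclass
    AND attname = 'a1';
-----

it can fold them into one command per table:

-----
UPDATE pg_catalog.pg_attribute SET attislocal = false
  WHERE attrelid = 'some_child_table'::pg_catalog.regclass
    AND attname IN ('a0', 'a1');
-----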

However, that's probably not sufficient to mark this issue
as closed. It seems likely that there are other patterns
that would cause backend memory bloat. One case that I found
is tables with a lot of inherited constraints (not partitions,
but old-style inheritance). For example, load the output of
this Perl script into a database:

-----
for (my $i = 0; $i < 100; $i++)
{
    print "CREATE TABLE test_inh_check$i (\n";
    for (my $j = 0; $j < 1000; $j++)
    {
        print "a$j float check (a$j > 10.2),\n";
    }
    print "b float);\n";
    print "CREATE TABLE test_inh_check_child$i() INHERITS(test_inh_check$i);\n";
}
-----
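
To set up the test database, pipe the script's output into psql,
e.g. "perl gen_inh_check.pl | psql -d test" (with whatever name you
saved the script under).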

pg_dump is horrendously slow on this, thanks to O(N^2) behavior in
ruleutils.c, and pg_upgrade is worse --- and in HEAD/v17 it leaks
memory too. The slowness was there in prior releases, so I think the
lack of field complaints indicates that this isn't a real-world use
case. Still, it's bad if pg_upgrade fails where it would not have
before, and there may be other similar issues.

So I'm forced to the conclusion that we'd better make the transaction
size adaptive as per Alexander's suggestion.
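
To illustrate what's being adapted: pg_restore's --transaction-size
batching boils down to wrapping groups of restore commands in
explicit transactions, roughly like this for a budget of three
(schematic, with made-up commands):

-----
BEGIN;
CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int);
CREATE TABLE t3 (a int);
COMMIT;
BEGIN;
CREATE TABLE t4 (a int);
COMMIT;
-----

Right now each TOC entry counts as one against the budget; with the
adaptive scheme, an entry that expands to many SQL commands counts
correspondingly more, so we commit sooner and keep the backend's
per-transaction resource usage bounded.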

In addition to the attached patches, I experimented with making
dumpTableSchema fold all the ALTER TABLE commands for a single table
into one command. That's doable without too much effort, but I'm now
convinced that we shouldn't do it. It would break the
semicolon-counting hack for detecting that tables like these involve
extra work. I'm also not very confident that the backend won't have
trouble with ALTER TABLE commands containing hundreds of subcommands.
That's probably something we ought to work on, but it's not a project
that I want to condition v17 pg_upgrade's stability on.
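
For reference, the folding I experimented with would have turned,
say,

-----
ALTER TABLE ONLY test_inh_check0 ALTER COLUMN a0 SET STATISTICS 100;
ALTER TABLE ONLY test_inh_check0 ALTER COLUMN a1 SET STATISTICS 100;
-----

into a single command with two subcommands:

-----
ALTER TABLE ONLY test_inh_check0
    ALTER COLUMN a0 SET STATISTICS 100,
    ALTER COLUMN a1 SET STATISTICS 100;
-----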

Anyway, proposed patches attached. 0001 is some trivial cleanup
that I noticed while working on the failed single-ALTER-TABLE idea.
0002 merges the catalog-UPDATE commands that dumpTableSchema issues,
and 0003 is Alexander's suggestion.

regards, tom lane

Attachment Content-Type Size
v1-0001-Fix-missing-ONLY-in-one-dumpTableSchema-command.patch text/x-diff 1.8 KB
v1-0002-Reduce-number-of-commands-dumpTableSchema-emits-f.patch text/x-diff 8.5 KB
v1-0003-Count-individual-SQL-commands-in-pg_restore-s-tra.patch text/x-diff 2.8 KB
