Re: speed up pg_upgrade with large number of tables

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: "杨伯宇(长堂)" <yangboyu(dot)yby(at)alibaba-inc(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "rhaas(at)postgresql(dot)org" <rhaas(at)postgresql(dot)org>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: speed up pg_upgrade with large number of tables
Date: 2024-07-05 08:26:13
Message-ID: 14FF292A-5DD0-4DC6-A7C5-82E7936B4940@yesql.se
Lists: pgsql-hackers

> On 5 Jul 2024, at 09:12, 杨伯宇(长堂) <yangboyu(dot)yby(at)alibaba-inc(dot)com> wrote:

> 1: Skip Compatibility Check In "pg_upgrade"
> =============================================
> Concisely, we've got several databases, each with a million-plus tables.
> Running the compatibility check before pg_dump can eat up like half an hour.
> If I have performed an online check before the actual upgrade, repeating it
> seems unnecessary and just adds to the downtime in many situations.
>
> So, I'm thinking, why not add a "--skip-check" option in pg_upgrade to skip it?
> See "1-Skip_Compatibility_Check_v1.patch".

How would a user know that nothing has changed in the cluster between running
the check and running the upgrade with a skipped check? Considering how
complicated it is to understand exactly what pg_upgrade does, it seems like
quite a large-caliber footgun.

I would be much more interested in making the check phase go faster, and indeed
there is ongoing work in this area. Since it sounds like you have a dev and
test environment with a big workload, testing those patches would be helpful.
https://commitfest.postgresql.org/48/4995/ is one that comes to mind.

--
Daniel Gustafsson
