From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "fabriziomello(at)gmail(dot)com" <fabriziomello(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us> |
Subject: | Re: pg_upgrade: Pass -j down to vacuumdb |
Date: | 2019-04-03 21:24:34 |
Message-ID: | 20190403212434.GY17544@telsasoft.com |
Lists: | pgsql-hackers |
On Wed, Apr 03, 2019 at 04:42:14PM -0400, Jeff Janes wrote:
> So maybe the first stage could
> be run by pg_upgrade itself, while the new server is still running on a
> linux socket in a private directory.
I think that would take too long. It would be less of an issue if pg_upgrade
gave feedback/progress during the analyze.
For our upgrades (which typically take ~15min, though several customers take up to
~60min), I analyze only base tables (essentially, those which are neither
parents nor children), then start services, then ANALYZE with the default stats
target. I would want to avoid delaying the services restart by more than another
(say) 5 minutes, and I would want to avoid even that unless there was a
progress report indicating that it's projected to take only a few more minutes.
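A rough sketch of the table selection described above, in Python, for illustration only. It assumes the inheritance relationships have already been read from pg_inherits and resolved to (child, parent) name pairs; the function names, the sample table names, and the choice to emit plain SQL text are all hypothetical and are not taken from any actual upgrade script.

```python
def base_tables(tables, inherits):
    """Return the tables that are neither inheritance parents nor children.

    tables   -- iterable of table names
    inherits -- iterable of (child, parent) name pairs, e.g. resolved from
                pg_inherits (inhrelid, inhparent); this input shape is an
                assumption made for the sketch
    """
    involved = set()
    for child, parent in inherits:
        involved.add(child)
        involved.add(parent)
    # Preserve the caller's ordering of the remaining tables.
    return [t for t in tables if t not in involved]


def analyze_script(tables, stats_target=None):
    """Build SQL text that analyzes the given tables.

    Table names are assumed to be already safely quoted. If stats_target
    is given, default_statistics_target is set for the session first.
    """
    lines = []
    if stats_target is not None:
        lines.append(f"SET default_statistics_target = {int(stats_target)};")
    for t in tables:
        lines.append(f"ANALYZE {t};")
    return "\n".join(lines)


# Hypothetical example: one standalone table and one partitioned hierarchy.
tables = ["customers", "events", "events_2019_03", "events_2019_04"]
inherits = [("events_2019_03", "events"), ("events_2019_04", "events")]

# First pass, before services start: base tables only.
first_pass = analyze_script(base_tables(tables, inherits), stats_target=1)

# Second pass, after services start: everything, default stats target.
second_pass = analyze_script(tables)
```

The low stats target in the first pass is only a placeholder value; the point is the split between a short pass before services restart and a full ANALYZE afterwards.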
I just did a test on one of our large-but-not-huge customers. With
stats_target=1, analyzing a 145GB partitioned table looks like it'll take
perhaps an hour; they have ~1TB of data, so delaying services during ANALYZE would
nullify the utility of pg_upgrade. By comparison, I can restore the essential
tables from backup in 15-30 minutes.
It might be fine if pg_upgrade took an option which enabled analyze, perhaps
instead of outputting analyze_new_cluster.sh. But actually, a problem with
*that* is that currently pg_upgrade avoids starting the new cluster. That
seems to be deliberate, since, with --link, that's an irreversible operation:
it's unsafe to start the old cluster afterwards.
Tangent: I have a queued mail from ~15 months ago wherein I proposed adding to
pg_upgrade an option to remove the old data dir (or probably only the files
associated with known relations). I realized at the time that would be pretty
scary without having first verified that the new cluster at least starts. I'm
not sure how good an idea that is, but --startnewcluster would be needed there,
too.
Justin