Re: Speeding up pg_upgrade

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Alexander Kukushkin <cyberdemn(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speeding up pg_upgrade
Date: 2017-12-07 18:22:23
Message-ID: CA+TgmobEOrd8kZe1P4XaT-QrLdzpHqiDB+9KssR0ujwxQkcvDg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 7, 2017 at 11:42 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
>> It seems pretty clear to me that we should somehow transfer stats from
>> the old server to the new one. Shouldn't it just be a matter of
>> serializing the MCV/histogram/ndistinct values, then have capabilities
>> to load on the new server?
>
> The reason pg_upgrade hasn't done that in the past is not wishing to
> assume that the new version does stats identically to the old version.
> Since we do in fact add stats or change stuff around from time to time,
> that's not a negligible consideration.

Yes, but we don't do that for every release. We could put rules into
pg_upgrade about which releases changed the stats format incompatibly,
and not transfer the stats when crossing between two releases with
incompatible formats. That's more than zero effort, of course, but it
might be worth it. We've already got CATALOG_VERSION_NO,
XLOG_PAGE_MAGIC, PG_CONTROL_VERSION, PG_PROTOCOL_LATEST,
BTREE_VERSION, HASH_VERSION, BRIN_CURRENT_VERSION,
GIN_CURRENT_VERSION, LOGICALREP_PROTO_VERSION_NUM,
PG_PAGE_LAYOUT_VERSION, PG_DATA_CHECKSUM_VERSION, K_VERS_MAJOR,
K_VERS_MINOR, K_VERS_REV, and the utterly unused MIGRATOR_API_VERSION.
Now, I have to admit that I find the process of trying to remember to
bump the correct set of version numbers in every commit just a tad
frustrating; it adds a cognitive burden I'd just as soon skip.
However, the failure to transfer stats over the years seems to have
actually caused real problems for many users, so I think in this case
we might be best off sucking it up and adding one more version number.
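
To make the idea concrete, here is a rough sketch in C of what such a
rule table might look like. Everything in it is hypothetical: the array
stats_format_breaks, the function stats_transferable(), and the version
numbers listed are placeholders for illustration, not existing
pg_upgrade code and not a record of when the stats format actually
changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * HYPOTHETICAL: major versions whose release changed the stats format
 * incompatibly with the release before it.  The values are placeholders
 * chosen for illustration only.
 */
static const int stats_format_breaks[] = {10, 12};

#define N_STATS_BREAKS \
    (sizeof(stats_format_breaks) / sizeof(stats_format_breaks[0]))

/*
 * Return true if stats could be carried over from old_major to
 * new_major, i.e. no incompatible format change lies in the range
 * (old_major, new_major].
 */
static bool
stats_transferable(int old_major, int new_major)
{
    size_t i;

    /* This sketch only considers upgrades to the same or a newer version. */
    assert(old_major <= new_major);

    for (i = 0; i < N_STATS_BREAKS; i++)
    {
        int v = stats_format_breaks[i];

        /* A break counts only if the upgrade crosses into version v. */
        if (old_major < v && v <= new_major)
            return false;
    }
    return true;
}
```

With the placeholder table above, stats_transferable(10, 11) is true
and stats_transferable(11, 13) is false, because the second upgrade
crosses the hypothetical format change at 12. A single stats format
version number stored with each release would amount to the same
check: transfer only when the old and new values match.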

We might even want to make it a little more fine-grained and track it
separately by data type, but I'm not sure if that's really worth it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
