From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Cc: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Proposal: In-Place upgrade concept |
Date: | 2007-07-03 18:53:26 |
Message-ID: | 12029.1183488806@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Tue, Jul 03, 2007 at 11:36:03AM -0400, Tom Lane wrote:
>> ... (Thought experiment: a page is read in during crash recovery
>> or PITR slave operation, and discovered to have the old format.)
> Hmm, actually, what's the problem with PITR restoring a page in the old
> format? As long as it's clear it's the old format, it'll get fixed when
> the page is actually used.
Well, what I'm concerned about is something like a WAL record providing
a new-format tuple to be inserted into a page, and then you find that
the page contains old-format tuples.
[ thinks some more... ] Actually, so long as we are willing to posit that
1. You're only allowed to upgrade a DB that's been cleanly shut down
   (no replay of old-format WAL allowed)
2. Page format conversion is WAL-logged as a complete page replacement
then AFAICS WAL-reading operations should never have to apply any
updates to an old-format page; the first touch of any old page in the
WAL sequence should be a page replacement that updates it to new format.
This is no different from the argument for why full_page_writes ensures
recovery from write failures.
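
The two conditions above can be illustrated with a toy replay loop. This is only a sketch under stated assumptions, not PostgreSQL code: the `Page`, `FullPageImage` and `InsertTuple` classes, the `convert` and `replay` functions, and the "v2:" tuple prefix are all hypothetical names invented for the illustration. It shows that when the conversion is logged as a whole-page replacement, replay starting from an old-format page never has to apply a delta to that old format.

```python
import copy
from dataclasses import dataclass, field

OLD, NEW = "old", "new"  # hypothetical page-format tags


@dataclass
class Page:
    fmt: str
    tuples: list = field(default_factory=list)


@dataclass
class FullPageImage:
    """Hypothetical WAL record: replaces the whole page (condition 2)."""
    page_no: int
    image: Page


@dataclass
class InsertTuple:
    """Hypothetical WAL record: a delta that assumes a new-format page."""
    page_no: int
    tup: str


def convert(page):
    """Toy conversion: return a new-format copy of an old-format page."""
    return Page(NEW, ["v2:" + t for t in page.tuples])


def replay(pages, wal):
    """Apply WAL records in order to a copy of the on-disk pages."""
    pages = copy.deepcopy(pages)
    for rec in wal:
        if isinstance(rec, FullPageImage):
            # Whole-page replacement: the previous format is irrelevant.
            pages[rec.page_no] = copy.deepcopy(rec.image)
        else:
            page = pages[rec.page_no]
            if page.fmt != NEW:
                # The situation the email worries about.
                raise RuntimeError("delta applied to old-format page")
            page.tuples.append(rec.tup)
    return pages


# Cleanly shut down old-format database (condition 1): no old WAL to replay.
disk = {0: Page(OLD, ["a", "b"])}

# First touch after the upgrade converts the page and logs the full image;
# only then is a new-format insert logged.
wal = [FullPageImage(0, convert(disk[0])), InsertTuple(0, "v2:c")]

recovered = replay(disk, wal)
```

If the `FullPageImage` record is dropped from `wal`, the same replay raises, which is the failure mode the two conditions are meant to rule out.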
So in principle the page-conversion stuff should always operate in a
live transaction. (Which is good, because now that I think about it
we couldn't emit a WAL record for the page conversion in those other
contexts.) I still feel pretty twitchy about letting it do catalog
access, though, because it has to operate at such a low level of the
system. bufmgr.c has no business invoking anything that might do
catalog access. If nothing else, there are deadlock issues.
On the whole I think we could define format conversions for user-defined
types as "not our problem". A new version of a UDT that has an
incompatible representation on disk can simply be treated as a new type
with a different OID, exactly as Zdenek was suggesting for index AMs.
To upgrade a database containing such a column, you install
"my_udt_old.so" that services the old representation, ALTER TYPE my_udt
RENAME TO my_udt_old, then install new type my_udt and start using that.
Anyway that seems good enough for version 1.0 --- I don't recall that
we've ever changed the on-disk representation of any contrib/ types,
so how important is this scenario in the real world?
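
The rename procedure described above can be sketched with a toy type catalog. Everything here is an assumption made for illustration: the `Catalog` class and its methods are hypothetical, the starting OID is arbitrary, and "my_udt.so" is an assumed name for the new library (only "my_udt_old.so" appears in the text). The point is that columns refer to a type by OID, so renaming the old type and creating a new one under the original name leaves existing columns bound to the old representation.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class TypeEntry:
    name: str
    library: str  # shared library servicing this on-disk representation


class Catalog:
    """Toy catalog: types are keyed by OID, columns store a type OID."""

    def __init__(self):
        self._next_oid = count(16384)  # arbitrary starting OID for the toy
        self.types = {}    # oid -> TypeEntry
        self.columns = {}  # (table, column) -> type oid

    def _oid_of(self, name):
        for oid, entry in self.types.items():
            if entry.name == name:
                return oid
        raise KeyError(name)

    def create_type(self, name, library):
        if any(e.name == name for e in self.types.values()):
            raise ValueError("type already exists: " + name)
        oid = next(self._next_oid)
        self.types[oid] = TypeEntry(name, library)
        return oid

    def rename_type(self, old_name, new_name, library=None):
        entry = self.types[self._oid_of(old_name)]
        entry.name = new_name
        if library is not None:
            entry.library = library

    def add_column(self, table, column, type_name):
        self.columns[(table, column)] = self._oid_of(type_name)

    def column_type(self, table, column):
        return self.types[self.columns[(table, column)]]


cat = Catalog()
old_oid = cat.create_type("my_udt", "my_udt.so")  # pre-upgrade state
cat.add_column("t", "c", "my_udt")

# Upgrade: the old representation is serviced by my_udt_old.so and renamed,
# then a new type is installed under the original name with a different OID.
cat.rename_type("my_udt", "my_udt_old", library="my_udt_old.so")
new_oid = cat.create_type("my_udt", "my_udt.so")
cat.add_column("t", "c2", "my_udt")
```

After these steps the existing column `t.c` still resolves to `my_udt_old`, while the new column `t.c2` uses the new `my_udt`.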
regards, tom lane