Re: In-place upgrade: catalog side

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 11:21:10
Message-ID: 871vwo2nkp.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:

> Here's a good example; that seems a perfect problem for somebody else to work
> on. I understand it now well enough to float ideas without even needing to see
> your code. Stop worring about it, I'll grab responsibility for making sure it
> gets done by someone.
>
> So, for everyone else who isn't Zdenek: when columns are dropped,
> pg_attribute.attisdropped turns true and atttypid goes to 0. pg_dump skips
> over them, and even if it didn't pg_restore doesn't know how to put them back.
> I can think of a couple of hacks to work around this, and one of them might
> even work:
>
> 1) Create a dummy type that exists only to flag these during conversion. Re-add
> all the deleted columns by turning off attisdropped and flag them with that
> type. Dump. Restore. Re-delete the columns. My first pass through poking
> holes in this idea wonders how the dump will go crazy if it finds rows that
> were created after the column was dropped, that therefore have no value for it.

No, those records would work fine, they will have the column set NULL. But in
any case it doesn't matter, you don't need to dump out the data at all --
that's kind of the whole point of the exercise after all :)

> Who wants to show off how much more they know about this than me by saying
> what's right or wrong with these various ideas?

*blush* :)

They all seem functional ideas. But it seems to me they're all ideas that
would be appropriate if this was a pgfoundry add-on for existing releases.
But if this is an integrated feature targeting future releases we have more
flexibility and there are more integrated approaches possible.

How about adding a special syntax for CREATE TABLE which indicates to include
a dropped column in that position? Then pg_dump could have a -X option to
include those columns as placeholders. Something like:

CREATE TABLE foo (
col1 integer,
NULL COLUMN,
col2 integer
);

> If we care about the fact that columns never go away and are using (1) or (2),
> could also consider adding some additional meta-data to 8.4 such that something
> like vacuum can flag when a column no longer exists in any part of the data.
> All deleted columns move from 8.3 to 8.4, but one day the 8.5 upgrade could
> finally blow them away. There's already plenty of per-table catalog data being
> proposed to push into 8.4 for making future upgrades easier, this seems like a
> possible candidate for something to make space for there. As I just came to
> appreciate the problem I'm not sure about that.

Hm, that's an interesting idea but I think it would work differently. If the
column is dropped but there are tuples where the column is present then vacuum
could squeeze the column out and set the null bit on each tuple instead. But
that would involve a lot of I/O so it wouldn't be something we would want to
do on a regular vacuum.

Actually removing the attribute is downright hard. You would have to have the
table locked, and squeeze the null bitmap -- and if you crash in the middle
your data will be toast. I don't see much reason to worry about dropping the
attribute though. The only cases where it matters are if you're near
MaxAttrNum (1600 columns IIRC) or if it's the only null column (and in a table
with more than 8 columns).

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2008-12-04 11:41:04 Re: Sync Rep: First Thoughts on Code
Previous Message Heikki Linnakangas 2008-12-04 11:19:15 Re: cvs head initdb hangs on unixware