Re: ALTER TABLE DROP COLUMN

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "PostgreSQL Development" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: ALTER TABLE DROP COLUMN
Date: 2000-06-11 16:22:42
Message-ID: 9500.960740562@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>> Seems we have 4 DROP COLUMN ideas:
>> Method Advantage
>> -----------------------------------------------------------------
>> 1 invisible column marked by negative attnum fast
>> 2 invisible column marked by is_dropped column fast
>> 3 make copy of table without column col removed
>> 4 make new tuples in existing table without column col removed

Bruce and I talked about this by phone yesterday, and we realized that
none of these are very satisfactory. #1 and #2 both have the flaw that
applications that examine pg_attribute will probably break: they will
see a sequence of attnum values with gaps in it. And what should the
rel's relnatts field be set to? #3 and #4 are better on that point,
but they leave us with the problem of renumbering references to columns
after the dropped one in constraints, rules, PL functions, etc.

Furthermore, there is a closely related problem that none of these
approaches give us much help on: recursive ALTER TABLE ADD COLUMN.
Right now, ADD puts the new column at the end of each table it's added
to, which often means that it gets a different column number in child
tables than in parent tables. That leads to havoc for pg_dump.

I think the only clean solution is to create a clear distinction between
physical and logical column numbers. Each pg_attribute tuple would need
two attnum fields, and pg_class would need two relnatts fields as well.
A column once created would never change its physical column number, but
its logical column number might change as a consequence of adding or
dropping columns before it. ADD COLUMN would ensure that a column added
to child tables receives the same logical column number as it has in the
parent table, thus solving the dump/reload problem. DROP COLUMN would
assign an invalid logical column number to dropped columns. They could
be numbered zero except that we'd probably still want a unique index on
attrelid+attnum, and the index would complain. I'd suggest using
Hiroshi's idea: give a dropped column a logical attnum equal to
-(physical_attnum + offset).

With this approach, internal operations on tuples would all use
physical column numbers, but operations that interface to the outside
world would present a view of only the valid logical columns. For
example, the parser would only allow logical columns to be referenced
by name; "SELECT *" would expand to valid logical columns in logical-
column-number order; COPY would send or receive valid logical columns
in logical-column-number order; etc.

Stored rules and so forth probably should store physical column numbers
so that they need not be modified during column add/drop.

This would require looking at all the places in the backend to determine
whether they should be working with logical or physical column numbers,
but the design is such that most all places would want to be using
physical numbers, so I don't think it'd be too painful.

Although I'd prefer to give the replacement columns two new names
(eg, "attlnum" and "attpnum") to ensure we find all uses, this would
surely break applications that examine pg_attribute. For compatibility
we'd have to recycle "attnum" and "relnatts" to indicate logical column
number and logical column count, while adding new fields (say "attpnum"
and "relnpatts") for the physical number and count.

Comments?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-06-11 17:02:47 Re: Install error current CVS
Previous Message Bruce Momjian 2000-06-11 13:28:08 Re: Current initdb broken.