From: | Serge Rielau <serge(at)rielau(dot)com> |
---|---|
To: | Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Fast AT ADD COLUMN with DEFAULTs |
Date: | 2016-10-05 21:56:31 |
Message-ID: | 2e441c6a-0835-403f-a342-8c8df9ea5955@rielau.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 5, 2016 at 2:45 PM, Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com> wrote:
On 10/5/16, Serge Rielau <serge(at)rielau(dot)com> wrote:
> Dear Hackers,
>
> I’m working on a patch that expands PG’s ability to add columns to a table
> without a table rewrite (i.e. at O(1) cost) from the
> nullable-without-default to a more general case. E.g.
...
> Is there an interest in principle in the community for this functionality?
Wow! I think it would be great! It also solves huge vacuuming after
rewriting the table(s).
Just pay attention to corner cases like indexes, statistics and speed. Yes, Yes, and still analyzing speed
But I'd like to see solution for more important cases like:
CREATE TABLE t (pk INT NOT NULL PRIMARY KEY);
INSERT INTO t VALUES (1), (2), (3);
ALTER TABLE t ADD COLUMN c1 timestamptz NOT NULL DEFAULT 'now';
SELECT * FROM t ORDER BY pk;
ALTER TABLE t ADD COLUMN c2 serial;
SELECT * FROM t ORDER BY pk;
INSERT INTO t(pk) VALUES (4);
SELECT * FROM t ORDER BY pk; By solution I think you mean a semantic change from what it is doing today which is: * “Now” is fixed to ALTER TABLE time for all pre-existing rows * serial will fill in the same value for all pre-existing rows Having different semantics for those would require a rewrite and probably different syntax in some form.
This is what my patch does on our PG derivative today: CREATE TABLE t (pk INT NOT NULL PRIMARY KEY); CREATE TABLE postgres=# INSERT INTO t VALUES (1), (2), (3); INSERT 0 3 postgres=# ALTER TABLE t ADD COLUMN c1 timestamptz NOT NULL DEFAULT 'now'; ALTER TABLE postgres=# SELECT * FROM t ORDER BY pk; pk | c1 ----+------------------------------- 1 | 2016-10-05 21:47:58.919194+00 2 | 2016-10-05 21:47:58.919194+00 3 | 2016-10-05 21:47:58.919194+00 (3 rows)
postgres=# postgres=# ALTER TABLE t ADD COLUMN c2 serial; SELECT * FROM t ORDER BY pk; INSERT INTO t(pk) VALUES (4); SELECT * FROM t ORDER BY pk;
ALTER TABLE t ADD COLUMN c2 serial; ALTER TABLE postgres=# SELECT * FROM t ORDER BY pk; pk | c1 | c2 ----+-------------------------------+---- 1 | 2016-10-05 21:47:58.919194+00 | 1 2 | 2016-10-05 21:47:58.919194+00 | 1 3 | 2016-10-05 21:47:58.919194+00 | 1 (3 rows)
postgres=# INSERT INTO t(pk) VALUES (4); INSERT 0 1 postgres=# SELECT * FROM t ORDER BY pk; pk | c1 | c2 ----+-------------------------------+---- 1 | 2016-10-05 21:47:58.919194+00 | 1 2 | 2016-10-05 21:47:58.919194+00 | 1 3 | 2016-10-05 21:47:58.919194+00 | 1 4 | 2016-10-05 21:47:58.919194+00 | 2 (4 rows) P.S.: I really think it is a good idea, just some research is
necessary and covering corner cases... Thanks. This would be my first contribution. I take it I would post a patch based on a recent PG 9.6 master for review? Or should I compose some sort of a design document?
Cheers Serge Rielau Salesforce.com
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2016-10-05 22:15:40 | Re: Fast AT ADD COLUMN with DEFAULTs |
Previous Message | Tom Lane | 2016-10-05 21:50:30 | Switch to unnamed POSIX semaphores as our preferred sema code? |