Re: Serials: removing the holes? (consecutive)

From: Masaru Sugawara <rk73(at)sea(dot)plala(dot)or(dot)jp>
To: "Leland F(dot) Jackson, CPA" <smvfp(at)mail(dot)smvfp(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Serials: removing the holes? (consecutive)
Date: 2002-08-05 10:26:02
Message-ID: 20020805192451.900E.RK73@sea.plala.or.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, 4 Aug 2002 09:02:02 -0500
"Leland F. Jackson, CPA" <smvfp(at)mail(dot)smvfp(dot)com> wrote:

> It is usually a bad idea to change a primary unique key in a table once it
> has be placed into use. This can especially be problamatic where other
> tables have foreign key constraints referenced into the table's primary key
> to enforce referential integrity.
> Also, the concept of next record and previous record does not exist in most
> enterprise class databases. All records are accessed by selection. This
> include postgres. Therefore, it is not possible to step through a postgres
> table one record at a time while updating the unique primary key with a new
> sequence number. Further complicating matters is when establishing a new
> sequence of number the danger exist that a currently used number may be
> duplicated causing an error with corresponding roll back to all updates by
> postgres.

I would also think that most DBs have none of such a concept, and it is
probably hard to avoid being duplicated the primary key by some other
sessions while a table's owner updates them. However, many of relational
DB have LOCK modes. if a target table is locked with an appropriate mode,
it is possible to prevent the others from inserting and deleting. And as far
as v7.2.1 at least, it seems to me that the owner don't need to be anxious
about the duplications even though a roll back occurs. Because a new
sequence is supposed to only have the numbers within the maximum number
of the old sequence(see below). Actually, if I intentionally add an error
statement next the UPDATE statement in the transaction, then the transaction
is aborted. But PG will succeed in rolling back without duplicating.

old key temp key new key
-------------------------------------
1 1 1
3 -> 2 -> 2 updating if old key <> temp key
4 -> 3 -> 3 updating if old key <> temp key
5 -> 4 -> 4 updating if old key <> temp key
(setval)
5 new inserting

> However, if you wanted to proceed with this, I would build a file, probably
> using perl, of the OID numbers. Each row of your table should have a unique
> OID number which postgres automatically add of each row in a table at the
> time the row is inserted.
>
> Once I had a file of OID numbers, I would build a program that read each
> line of the file, grab the corresponding row in your table and insert the
> new sequence number.

I believe your approach is one of the right ways. BTW, where do you think
you are going to insert the new sequence number ? If into new column,
you'll probably have to treat the new rows that inserted while you are
grabbing the old rows, I guess. After all, I think this way also needs to
lock tables and/or to kill all connections so that no rows are inserted or
deleted. But, from a reliable point of view, I realize your thought is better
than mine because of no change of a primary key

From your reply, I notice I should have said that there were some dangerous
points if a primary key was updated, and there is still a bag in my idea.
Thank you for your advice.

Note:
* to change the number of a primary key is not good(but I don't
think it is bad).
* how to deal with some other tables that refer to the primary key
of a target table as a foreign key
* whenever setval() is executed unless it is in the outside of a
transaction which includes a lock mode, there is possibility that
changing a sequence will cause an error of a duplicate key
* the fact that this method may not run correctly in the prior PG
versions.
* this method depends on the behavior of PG's sequence-and-
transaction systems, and basically no error occurs unless updating
is executed in the order of the sort
* LIMIT ALL statement cannot be removed from the UPDATE
statement, because the temp. sequence acts unexpectedly in the
WHERE clause if there is not its statement in the subquery.
* and, etc.

-- to eliminate gaps of the sequence
BEGIN;
LOCK TABLE tbl IN ACCESS EXCLUSIVE MODE;
CREATE TEMP SEQUENCE seq_n;
UPDATE tbl SET a = t1.i
FROM (SELECT t0.*, nextval('seq_n') AS i
FROM (SELECT * FROM tbl ORDER BY a) AS t0
LIMIT ALL ) AS t1
WHERE t1.a <> t1.i
AND tbl.a = t1.a;
DROP SEQUENCE seq_n;
-- SELECT setval('seq_p_key', (SELECT COUNT(*) FROM tbl)); <- mistake
SELECT setval('seq_p_key', (SELECT max(a) FROM tbl));
END;

Regards,
Masaru Sugawara

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2002-08-05 10:48:02 Puzzling planner choice (non-urgent)
Previous Message marko.asplund 2002-08-05 09:57:42 concept question: PostgreSQL vs. Oracle database