Re: Help with pre-loaded arbitrary key sequences

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "James B(dot) Byrne" <byrnejb(at)harte-lyne(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Help with pre-loaded arbitrary key sequences
Date: 2008-01-16 23:40:13
Message-ID: dcc563d10801161540q3ecba281t4b500b6081ab0033@mail.gmail.com
Lists: pgsql-general

On Jan 11, 2008 10:43 AM, James B. Byrne <byrnejb(at)harte-lyne(dot)ca> wrote:
> I am prototyping a system migration that is to employ Ruby, Rails and
> PostgreSQL. Rails has the convention that the primary key of a row is an
> arbitrary integer value assigned by the database manager through a
> sequence. As it turns out, the legacy application employs essentially the
> same convention in most instances.
>
> My question is this: Can one assign an id number to a sequenced key column
> on create and override the sequencer? If one does this then can and, if
> so, how does the sequencer in Postgresql handle the eventuality of running
> into a block of keys holding previously assigned numbers?
>
> For example. The existing client master dataset employs an eight digit
> account number as primary key. The values in use tend to cluster in
> groups at each thousand increment, thus 1..375, 1001..1288, 2001..2225,
> 3001..3312, ..., 2001001..2001476, ..., etc.
>
> Assuming that these existing entries were all loaded into the new table
> with the values given as their primary keys and given that one could not
> simply start the sequencer at a value above the highest existing value: If
> one was to add a record and auto-generate a sequence number then can the
> sequencer handle looping from 1 to 375 and returning 376 and then continue
> until it reaches 1001, when it needs to loop again until 1288 and then
> return 1289 and so forth?

You're essentially wanting to fill in the blanks here. If you need
good performance, then what you'll need to do is preallocate, in a
separate table, all the numbers that haven't been assigned yet. So, we
make a table something like:

create table locatorcodes (i int, count_id serial);

Then we insert an id into that table for everyone that's missing from
the main table:

-- fill locatorcodes with every id from 1..1000000 that isn't already
-- used in main_table; count_id numbers the free ids as they go in
insert into locatorcodes (i)
select b.i from (
  select * from generate_series(1,1000000) as i
) as b
left join main_table a on (b.i = a.i)
where a.i is null;

Or something like that.
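
As a quick sanity check (assuming the table definitions above), you
can look at what got preallocated:

-- how many free ids were captured, and the first few in checkout order
select count(*) from locatorcodes;
select * from locatorcodes order by count_id limit 5;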

Now, we've got a table with all the unused ids, and a serial count
assigned to them. Create another sequence:

create sequence checkout_sequence;

and use that to "check out" numbers from locatorcodes:

select i from locatorcodes where count_id=nextval('checkout_sequence');

And since the sequence will just count up, there's little or no
problem with performance.
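
For instance, here's a rough sketch of grabbing a free id at insert
time (main_table and its columns are just placeholders for your real
schema):

-- check out the next unused id and use it as the new row's key
insert into main_table (i, name)
values (
  (select i from locatorcodes
   where count_id = nextval('checkout_sequence')),
  'New Client'
);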

There's lots of ways of handling this. That's just one of the ones
that doesn't slow your database down a lot.

If you need to, you can shuffle the numbers going into the
locatorcodes table with an order by random() when you create it.
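
Something along these lines, assuming the same tables as above:

-- same fill-in query, but hand out the free ids in random order
insert into locatorcodes (i)
select b.i from (
  select * from generate_series(1,1000000) as i
) as b
left join main_table a on (b.i = a.i)
where a.i is null
order by random();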
