From: | Steve Midgley <public(at)misuse(dot)org> |
---|---|
To: | "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Increment a sequence by more than one |
Date: | 2007-08-03 19:28:32 |
Message-ID: | 20070803192851.0D4449FB1D6@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Hi Scott,
Thanks for this info (and Michael too!).
Let me see if I understand your suggestion. I would run these three
commands in sequence:
# select nextval('[my_seq_name]');
returns => 52 [I believe that the sequence is at 52]
# alter sequence [my_seq_name] increment by 5000;
# select nextval('[my_seq_name]');
returns => 5052
If the third command doesn't return "5052" - i.e. it returns 5053, then
I know that somewhere in this sequence another process grabbed an id
out from under me. It doesn't matter where, but I would know that my
5000 id's are not unique and should be discarded? If the third command
DOES return 5052, then I know that those 5000 id's are "locked" for my
use and no other application could have grabbed one of them?
Can anyone see a flaw in that? It looks right to me..
Scott - it also seems to me that I need not waste all those id's if
another application does grab one during my statement: If I detect a
failure, I could just reset the pk sequence back to the max id of the
underlying table before trying again. I think this code would do it
(stolen from Ruby's postgres adaptor):
SELECT setval('[seq_name]', (SELECT COALESCE(MAX([pk_of_seq]), (SELECT
min_value FROM [seq_name])) FROM [table_of_pk]), false)
So for table "property" with pk of "id" and sequence name
"property_id_seq":
SELECT setval('property_id_seq', (SELECT COALESCE(MAX(id), (SELECT
min_value FROM property_id_seq)) FROM property), false)
Then I could just retry the first process - though on a table that is
very busy, I might have retry for a while..
Regarding Michael's suggestion - I tried messing around with LOCK and
similar commands but they're only allowed to run against TABLES not
SEQUENCES - too bad - that would have been perfect.
I'm now starting to think that there's no way to solve this problem in
an "elegant manner" even in a stored procedure? Your method seems to be
as good as it's going to get? (Not that I'm complaining!)
Thanks again - any thoughts are welcome,
Steve
At 08:01 PM 8/3/2007, Scott Marlowe wrote:
>On 8/3/07, Steve Midgley <public(at)misuse(dot)org> wrote:
> >
> > Hi,
> >
> > I'm writing an import app in a third party language. It's going to
> use
> > "copy to" to move data from STDIN to a postgres (8.2) table. There
> are some
> > complexities though: it's going to copy the records to a
> "versioned" table
> > first, and then at a later time the records will be copied by a
> different
> > procedure to the "live" table. The live table and versioned table
> are
> > identical in terms of their field definitions. But there is no
> sequence
> > associated with the versioned table (whose primary key is "id" plus
> > "import_group_id", whereas the live table's pk is just "id"). So
> all
> > versioned table entries must already "know" what their id would be
> in the
> > live table. (This makes sense for other business process we have,
> but it's a
> > bit of a problem in this instance).
> >
> > My problem: I'd like to be able to grab a block of id's from the
> live
> > table's pk sequence. So let's say my importer has 5,000 new rows to
> import
> > and the current max pk in the live table is 540,203. I'd like to be
> able to
> > increment the primary key sequence in such a way that I get a block
> of ids
> > all to myself and the sequence is reset to 545,203 with a guarantee
> that all
> > the id's between 540203 and 545203 are unused.
>
>The real danger in doing this is race conditions. Most anything you
>do involves a possible race condition. As long as the race condition
>doesn't result in an id getting used twice, you're safe.
>
>So:
>
>test=# create sequence m;
>CREATE SEQUENCE
>test=# select nextval('m');
> nextval
>---------
> 1
>(1 row)
>
>test=# alter sequence m increment by 5000;
>ALTER SEQUENCE
>test=# select nextval('m');
> nextval
>---------
> 5001
>(1 row)
>
>test=# alter sequence m increment by 1;
>ALTER SEQUENCE
>test=# select nextval('m');
> nextval
>---------
> 5002
>(1 row)
>
>In this method, the only possible race condition is that someone might
>run a nextval('m') between the time you set the increment to 5000 and
>1 again. If that happens, you'd have 5,000 missing ids, but since
>sequences are designed to prevent dupes, not holes, that's ok.
>
> > But since I've seen so much magic on display from people on this
> list, I'm
> > going to ask if it's possible to do this solely from PG SQL sent
> from a
> > third party language? The main tricky bit seems to be ensuring that
> > everything is locked so two such increment calls at the same time
> don't
> > yield overlapping blocks of ids. Is there a way to "lock" the
> sequence
> > generator for the duration of a "nextval" and "setval" call?
>
>Avoiding the setval is the real key. It doesn't scale. Missing 5,000
>ids is no big deal. repeating them IS a big deal. Not using setval
>is the best answer.
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Marlowe | 2007-08-03 19:42:59 | Re: Increment a sequence by more than one |
Previous Message | Scott Marlowe | 2007-08-03 19:01:38 | Re: Increment a sequence by more than one |