Re: Efficiently advancing a sequence without risking it going backwards.

From: Tim Cross <theophilusx(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Efficiently advancing a sequence without risking it going backwards.
Date: 2020-07-09 23:07:51
Message-ID: 87d0542soo.fsf@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Christopher Browne <cbbrowne(at)gmail(dot)com> writes:

> On Thu, 9 Jul 2020 at 12:59, Jeremy Schneider <schneider(at)ardentperf(dot)com>
> wrote:
>
>>
>> > On Jul 6, 2020, at 19:06, Paul McGarry <paul(at)paulmcgarry(dot)com> wrote:
>> >
>> > I don't think I can use setval(), because it risks making sequences go
>> backwards, eg:
>> >
>> > 1) Check values
>> > DB1sequence: 1234
>> > DB2sequence: 1233 (1 behind)
>> > 2) setval('DB2sequence',1234);
>> >
>> > but if between (1) and (2) there are 2 nextval(DB2sequence) calls on
>> another process, (2) would take the sequence back from 1235 to 1234 and I
>> would end up trying to create a duplicate key ID from the sequence.
>>
>> An ability to “lock” the sequence momentarily would give you the tool you
>> need, but I don’t think it’s there.
>>
>> Total hack, but if your application or users can retry when the rare error
>> is encountered then one idea is to rename the sequence momentarily while
>> you do the setval() then rename it back. Do an initial check without
>> renaming, then re-check after renaming and before the setval() call.
>>
>> If you put retry logic into your application then make sure to include
>> back-off logic so you don’t get an outage induced by thundering herd.
>>
>
> This is increasingly looking like a set of attempts to intentionally abuse
> what sequences were designed for.
>
> The use-case where you need a lock on the value so that there can't
> possibly be a hole in the sequence points at the notion of having some
> other kind of a function that takes out a lock on a table, and serially
> gives out "MAX+1" as the next value.
>
> That isn't a very difficult function to write; the problem with it is that
> that sort of function will forcibly serialize all inserts through the
> function+table lock that is giving out "MAX+1" values. That's going to be
> WAY slower than using a sequence object, and about 98% of the time, people
> will prefer the sequence object, particularly because it's about 98% faster.
>
> I'm not quite sure if anyone has put out there a standard-ish idiom for
> this; that seems like a not TOO difficult "exercise for the user."
>
> There will definitely be more failure cases, and *wildly* more fighting, in
> a concurrent environment, over tuple locks.
>
> - An obvious failure is that if one connection asks for the new MAX+1, gets
> it, and then the transaction fails, for some later, out-of-relevant-scope,
> reason, you'll still potentially get some "holes" in the series of values.
>
> - If there are 10 connections trying to get MAX+1 concurrently, only one
> can get it at a time, and that connection can't relinquish the lock until
> its transaction has completed, and the 9 must wait, regardless of how much
> work the "winner" still has to do.
>
> These are amongst the reasons why people conclude they *don't* want that
> kind of functionality.
>
> It makes me think that the problem needs to be taken back to that initial
> point of "I think I need some somewhat coordinated sequences", and poke at
> what the *real* requirement is there, and why someone thinks that the
> values should be "somewhat coordinated." Something seems off there.

I agree and was going to write something similar. All the 'solutions'
are problematic in one way or the other and seem to be due to a
misconception about the role for sequences or some requirement which
needs to be re-examined.
--
Tim Cross

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Michel Pelletier 2020-07-09 23:10:27 invalid non-zero objectSubId for object class
Previous Message Tom Lane 2020-07-09 21:39:40 Re: Is this a bug in pg_current_logfile() on Windows?