Re: Select for update with offset interferes with concurrent transactions

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Yngve Nysaeter Pettersen <yngve(at)opera(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Select for update with offset interferes with concurrent transactions
Date: 2011-02-01 17:11:23
Message-ID: 4D483EBB.2090307@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2/1/2011 10:59 AM, Yngve Nysaeter Pettersen wrote:
> Hi,
>
> Thanks for the quick answer, Andy.
>
> On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson <andy(at)squeakycode(dot)net>
> wrote:
>
> <snip>
>> So, if I understand correctly, you:
>>
>> q = SELECT record_id FROM queue
>> WHERE project_id = my_project AND state = idle
>> LIMIT n OFFSET i FOR UPDATE
>> while not q.eof
>> update queue set state = started where record_id = x;
>> process record_id
>> update queue set state = finsihed where record_id = x;
>> q.next;
>
> Almost, the update to "started" is done for all selected elements first,
> releasing the lock, then the items are processed one at a time, marking
> each "finished" as they complete. (each processing step can take
> minutes, so keeping a lock the whole time is not an option)
>
>> Might I suggest and alternative:
>>
>> q = update queue set state = started
>> WHERE project_id = my_project AND state = idle
>> LIMIT n OFFSET i
>> RETURNING project_id;
>> idlist = @q;
>> commit;
>>
>> foreach x in idlist
>> process record_id
>> begin
>> update queue set state = finsihed where record_id = x;
>> commit;
>>
>> Forgive the part perl part python sudocode. Oh, and I've never done
>> this, no idea if it actually works. :-)
>
> Thanks for that suggestion, I'll take a look at it.
>
> While I hadn't caught on to the "RETURNING" part, I had been wondering
> if using a single step UPDATE might be a solution. One concern I have is
> how concurrent updates will affect the returned list (or if they will
> just be skipped, as SELECT would in normal transaction mode, if I
> understood correctly), or whether it might return with an error code (I
> know that the normal update return value is the number of updated items,
> just not sure if that applies for "RETURNING").
>
> Although, I will note that this process (if it works) will, sort of,
> make FOR UPDATE redundant. Or, if it doesn't, the current lock-policy
> might cause issues for concurrent updates for the use-cases where FOR
> UPDATE is relevant.
>

Yeah, I'd wondered the same thing. It could be two updates hitting the
same row will deadlock, or maybe not, I'm not sure. But I think its the
same as with the select, if you happen to have two limits that hit the
same range, you're in trouble.

I think the random limit thing is a race condition itself. Whenever you
have multiple processes hitting the same rows you're going to run into
problems. Have you thought of using a sequence instead of a random
limit? Each process could get the next 100 record_id'd via a sequence,
then there would be much less chance of deadlock.

-Andy

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2011-02-01 17:18:17 Re: Select for update with offset interferes with concurrent transactions
Previous Message Vick Khera 2011-02-01 17:06:56 Re: Book recommendation?