From: Andy Colson <andy(at)squeakycode(dot)net>
To: Yngve Nysaeter Pettersen <yngve(at)opera(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Select for update with offset interferes with concurrent transactions
Date: 2011-02-01 17:11:23
Message-ID: 4D483EBB.2090307@squeakycode.net
Lists: pgsql-general
On 2/1/2011 10:59 AM, Yngve Nysaeter Pettersen wrote:
> Hi,
>
> Thanks for the quick answer, Andy.
>
> On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson <andy(at)squeakycode(dot)net>
> wrote:
>
> <snip>
>> So, if I understand correctly, you:
>>
>> q = SELECT record_id FROM queue
>> WHERE project_id = my_project AND state = idle
>> LIMIT n OFFSET i FOR UPDATE
>> while not q.eof
>> update queue set state = started where record_id = x;
>> process record_id
>> update queue set state = finished where record_id = x;
>> q.next;
>
> Almost, the update to "started" is done for all selected elements first,
> releasing the lock, then the items are processed one at a time, marking
> each "finished" as they complete. (each processing step can take
> minutes, so keeping a lock the whole time is not an option)
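>
> The flow I described could be sketched in SQL roughly like this (table and
> column names are from the pseudocode above; the literal ids and block sizes
> are made up for illustration):
>
>     BEGIN;
>     -- lock a batch of idle rows
>     SELECT record_id FROM queue
>      WHERE project_id = 42 AND state = 'idle'
>      LIMIT 100 OFFSET 200
>        FOR UPDATE;
>
>     -- mark the whole batch as started, then commit to release the locks
>     UPDATE queue SET state = 'started'
>      WHERE record_id IN (...);  -- the ids returned by the SELECT above
>     COMMIT;
>
>     -- later, as each item completes (outside any long transaction):
>     UPDATE queue SET state = 'finished' WHERE record_id = 12345;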
>
>> Might I suggest an alternative:
>>
>> q = update queue set state = started
>> WHERE project_id = my_project AND state = idle
>> LIMIT n OFFSET i
>> RETURNING record_id;
>> idlist = @q;
>> commit;
>>
>> foreach x in idlist
>> process record_id
>> begin
>> update queue set state = finished where record_id = x;
>> commit;
>>
>> Forgive the part-Perl, part-Python pseudocode. Oh, and I've never done
>> this, no idea if it actually works. :-)
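>
> For what it's worth, PostgreSQL's UPDATE statement has no LIMIT/OFFSET
> clause, so the usual way to express that idea in real SQL is a locking
> subselect (a sketch only; names and the literal values are assumed):
>
>     UPDATE queue SET state = 'started'
>      WHERE record_id IN (
>            SELECT record_id FROM queue
>             WHERE project_id = 42 AND state = 'idle'
>             LIMIT 100 OFFSET 200
>               FOR UPDATE)
>     RETURNING record_id;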
>
> Thanks for that suggestion, I'll take a look at it.
>
> While I hadn't caught on to the "RETURNING" part, I had been wondering
> if using a single step UPDATE might be a solution. One concern I have is
> how concurrent updates will affect the returned list (or if they will
> just be skipped, as SELECT would in normal transaction mode, if I
> understood correctly), or whether it might return with an error code (I
> know that the normal update return value is the number of updated items,
> just not sure if that applies for "RETURNING").
>
> Although, I will note that this process (if it works) will, sort of,
> make FOR UPDATE redundant. Or, if it doesn't, the current lock-policy
> might cause issues for concurrent updates for the use-cases where FOR
> UPDATE is relevant.
>
Yeah, I'd wondered the same thing. It could be that two updates hitting the
same row will deadlock, or maybe not; I'm not sure. But I think it's the
same as with the select: if you happen to have two limits that hit the
same range, you're in trouble.
I think the random limit thing is a race condition in itself. Whenever you
have multiple processes hitting the same rows, you're going to run into
problems. Have you thought of using a sequence instead of a random
limit? Each process could get the next 100 record_ids via a sequence,
and there would be much less chance of deadlock.
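
Something along these lines, as a rough sketch (the sequence name, block
size, and project id are all assumptions, not anything from your schema):

    -- each worker grabs a distinct block number; nextval never hands
    -- the same value to two sessions
    SELECT nextval('queue_block_seq') AS block_no;

    -- the worker then claims only rows inside its own id block
    UPDATE queue SET state = 'started'
     WHERE project_id = 42 AND state = 'idle'
       AND record_id BETWEEN (:block_no * 100) AND (:block_no * 100 + 99)
    RETURNING record_id;

Since no two workers ever share a block, their UPDATEs touch disjoint rows
and can't block each other.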
-Andy