From: | Andy Colson <andy(at)squeakycode(dot)net> |
---|---|
To: | Yngve Nysaeter Pettersen <yngve(at)opera(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Select for update with offset interferes with concurrent transactions |
Date: | 2011-02-01 15:19:17 |
Message-ID: | 4D482475.6050408@squeakycode.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 2/1/2011 6:32 AM, Yngve Nysaeter Pettersen wrote:
> Hello all,
>
> I am in the process of migrating a system from Postgresql 8.3 to 9.0,
> and have run into a problem with the task queue systems I am using.
>
> The task queue controls the allocation of tasks between about 1000
> processes working in parallel, and is essentially a table of
>
> record_id (unique)
> project_id
> task_description_id
> state (idle, started, finished)
>
> Each project currently have about 2 million entries. My plan is to
> increase that significantly the next few months.
>
> To avoid having the processes trample each other's queries (the first
> attempt was to select the first matching entries of the table, which
> caused one to block all other transactions), one of the steps I took was
> to select a set of idle rows at a random offset into the table from the
> project, mark them for update, then update each record's state as started.
>
> SELECT record_id FROM queue WHERE project_id = my_project AND state =
> idle LIMIT n OFFSET i FOR UPDATE
>
> At present "n" is 100-150, "i" is a random value in the range 0-10000.
>
> There is, intentionally, no ordering specified, since that would just
> slow down the query, and is not necessary.
>
> For reference, the above query is sent through Django's cursor.execute()
> call in a manual transaction block.
>
>
>
> What I've discovered when using Postgres 9.0 is that the processes are
> now blocking every other query into this table, apparently reducing the
> task processing speed by at least a factor of 10, and increasing the
> load on the server by a similar factor, compared to when Postgres 8.3
> was used. The problem is apparent just after starting, with only 50-100
> processes active (startup is staggered).
>
> Reducing "n" (and looping), or increasing the "i" range did not work.
>
>
> The reason seems to be this new part of
> http://www.postgresql.org/docs/9.0/static/sql-select.html (towards the
> end of the FOR UPDATE section):
>
> If a LIMIT is used, locking stops once enough rows have been returned to
> satisfy the limit
> (but note that rows skipped over by OFFSET will get locked). Similarly,
> if FOR UPDATE or
> FOR SHARE is used in a cursor's query, only rows actually fetched or
> stepped past by the
> cursor will be locked.
>
> I can't find similar text in the 8.3 or 8.4 documentation.
>
> AFAICT, and assuming I have not misunderstood this part of the
> documentation this means that if one of my processing nodes selects a
> block of 100 entries at offset 8000 in the resulting table, then every
> other node will be blocked while the block is being processed, not just
> the nodes that would have selected the rows in the range 0 to 7999, but
> also >=8100, because they cannot gain access to the rows.
>
> Also, using FOR SHARE does not seem to solve the problem.
>
> IMO, as a database non-expert, locking rows that were not returned as a
> result of the query is a bug. As an example, if a query selects the X
> last items in the matching rows, that is equivalent to locking the
> table, or the relevant part of it, even if the requester have no
> intention to modify those other rows.
>
>
> Is there any way to avoid this problem? Or do I have to add a random
> batch_id field to the queue table in order to separate the processes'
> queries so that they do not block each other (as frequently)?
>
> Is it possible to disable the source code causing this (that is,
> reverting the patch that introduced the problem, or changing a
> configuration switch)?
>
>
So, if I understand correctly, you:
q = SELECT record_id FROM queue
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i FOR UPDATE
while not q.eof
update queue set state = started where record_id = x;
process record_id
update queue set state = finsihed where record_id = x;
q.next;
Might I suggest and alternative:
q = update queue set state = started
WHERE project_id = my_project AND state = idle
LIMIT n OFFSET i
RETURNING project_id;
idlist = @q;
commit;
foreach x in idlist
process record_id
begin
update queue set state = finsihed where record_id = x;
commit;
Forgive the part perl part python sudocode. Oh, and I've never done
this, no idea if it actually works. :-)
-Andy
From | Date | Subject | |
---|---|---|---|
Next Message | hlcborg | 2011-02-01 15:36:56 | Problem with encode () and hmac() in pgcrypto |
Previous Message | Dario Beraldi | 2011-02-01 15:08:50 | Cluster table and order information |