Re: Concurrent updates

From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Steve Erickson *EXTERN*" <serickson(at)digitiliti(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Concurrent updates
Date: 2013-03-19 08:32:43
Message-ID: A737B7A37273E048B164557ADEF4A58B057C73E0@ntex2010a.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Steve Erickson wrote:
> I have a table that I want to use as a queue with all functionality (Insert, update, delete) embodied
> in a stored procedure. Inserts and deletes are no problem. An external program would call the stored
> procedure to get one or more emails to work on, selecting on "state='N'", then updating the row so
> "state='P'". My problem is having multiple threads calling the stored procedure simultaneously and
> getting the same row(s). Selecting FOR UPDATE won't work as, if thread #1 gets 3 rows and thread #2
> starts before thread #1 completes (Commits), thread #2 will select the same 3 rows as thread #1
> except, since thread #1 will update the state (changing the state to 'P') so that those rows no longer
> meet thread #2 criteria, and thread #2 will receive zero rows. The table looks like:
>
> CREATE TABLE dss.stage_email
> (
> emailid bigserial NOT NULL,
> email_path text,
> state character(1) DEFAULT 'N'::bpchar, -- N = New Email, P=Processing, D=Deleting
> fetch_date timestamp without time zone DEFAULT now(),
> ingest_date timestamp without time zone
> )

Strange, it works fine for me.

SELECT * FROM stage_email;

emailid | email_path | state | fetch_date | ingest_date
---------+------------+-------+----------------------------+-------------
1 | one | N | 2013-03-19 09:25:35.25905 |
2 | two | N | 2013-03-19 09:25:38.682343 |
3 | three | N | 2013-03-19 09:25:42.162118 |
4 | four | N | 2013-03-19 09:25:45.626052 |
(4 rows)

Session 1: BEGIN;
Session 1: SELECT * FROM stage_email WHERE state = 'N' ORDER BY emailid LIMIT 2 FOR UPDATE;

emailid | email_path | state | fetch_date | ingest_date
---------+------------+-------+----------------------------+-------------
1 | one | N | 2013-03-19 09:25:35.25905 |
2 | two | N | 2013-03-19 09:25:38.682343 |
(2 rows)

Session 2: BEGIN;
Session 2: SELECT * FROM stage_email WHERE state = 'N' ORDER BY emailid LIMIT 2 FOR UPDATE;
(Session 2 blocks)

Session 1: UPDATE stage_email SET state = 'P' WHERE emailid IN (1, 2);
Session 1: COMMIT;

Now session 2 gets this result:

emailid | email_path | state | fetch_date | ingest_date
---------+------------+-------+----------------------------+-------------
3 | three | N | 2013-03-19 09:25:42.162118 |
4 | four | N | 2013-03-19 09:25:45.626052 |
(2 rows)

So session 2 can continue processing the next two entries.

Maybe you can illustrate your problem with an example.

Yours,
Laurenz Albe

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2013-03-19 12:28:23 Re: [GENERAL] Trust intermediate CA for client certificates
Previous Message Craig Ringer 2013-03-19 06:14:49 Re: [HACKERS] Trust intermediate CA for client certificates