UPDATE grabs multiple rows when it seems like it should only grab one

From: Kevin Burke <burke(at)shyp(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: UPDATE grabs multiple rows when it seems like it should only grab one
Date: 2016-04-22 22:56:17
Message-ID: CAEYV4pYAG+o2_SO7+Tc9aUF8A9aZW=bNi2H7yd39i9fDy5nEEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,
I'm trying to write a job queue that grabs one job at a time from the
queue. I expect that the following query should update a maximum of one row
in the table:

UPDATE queued_jobs
SET status='in-progress',
updated_at=now()
FROM (
SELECT id AS inner_id
FROM queued_jobs
WHERE status='queued'
AND name = $1
AND run_after <= now()
LIMIT 1
FOR UPDATE
) find_job
WHERE queued_jobs.id = find_job.inner_id
AND status='queued'
RETURNING id,
name,
attempts,
run_after,
expires_at,
status,
data,
created_at,
updated_at

However, I observe that multiple rows are updated. I am certain that it's a
single query updating multiple rows, because I observed this in the EXPLAIN
output, and also configured my application to crash if multiple rows were
returned, and could reliably trigger an application crash.

Here is the EXPLAIN output from a query when two rows were returned:

Update on queued_jobs (cost=0.75..16.83 rows=1 width=120) (actual
time=3.011..67.515 rows=2 loops=1)
-> Nested Loop (cost=0.75..16.83 rows=1 width=120) (actual
time=2.974..67.458 rows=2 loops=1)
Join Filter: (queued_jobs.id = find_job.inner_id)
Rows Removed by Join Filter: 475
-> Index Scan using queued_jobs_pkey on queued_jobs
(cost=0.38..8.39 rows=1 width=80) (actual time=0.011..1.326 rows=477
loops=1)
Filter: (status = 'queued'::job_status)
Rows Removed by Filter: 1
-> Subquery Scan on find_job (cost=0.38..8.42 rows=1 width=56)
(actual time=0.137..0.138 rows=1 loops=477)
-> Limit (cost=0.38..8.41 rows=1 width=22) (actual
time=0.136..0.136 rows=1 loops=477)
-> LockRows (cost=0.38..8.41 rows=1 width=22) (actual
time=0.136..0.136 rows=1 loops=477)
-> Index Scan using find_queued_job on
queued_jobs queued_jobs_1 (cost=0.38..8.40 rows=1 width=22) (actual
time=0.134..0.135 rows=2 loops=477)
Index Cond: ((name = $1) AND (run_after <=
now()))
Filter: (status = 'queued'::job_status)

Here's the EXPLAIN output from a "normal" query that only gets one row:

Update on queued_jobs (cost=0.41..8.53 rows=1 width=120) (actual
time=3.730..3.733 rows=1 loops=1)
-> Nested Loop (cost=0.41..8.53 rows=1 width=120) (actual
time=3.688..3.690 rows=1 loops=1)
-> Subquery Scan on find_job (cost=0.00..0.08 rows=1 width=56)
(actual time=3.672..3.673 rows=1 loops=1)
-> Limit (cost=0.00..0.07 rows=1 width=22) (actual
time=3.662..3.662 rows=1 loops=1)
-> LockRows (cost=0.00..2935.47 rows=42743 width=22)
(actual time=3.661..3.661 rows=1 loops=1)
-> Seq Scan on queued_jobs queued_jobs_1
(cost=0.00..2508.04 rows=42743 width=22) (actual time=1.362..1.375 rows=5
loops=1)
Filter: ((status = 'queued'::job_status)
AND (name = $1) AND (run_after <= now()))
Rows Removed by Filter: 1
-> Index Scan using queued_jobs_pkey on queued_jobs
(cost=0.41..8.44 rows=1 width=80) (actual time=0.012..0.013 rows=1 loops=1)
Index Cond: (id = find_job.inner_id)
Filter: (status = 'queued'::job_status)

For convenience, I've posted these (and a table schema) here:
https://gist.github.com/kevinburkeshyp/ba5fdac337b3793628261de5fb26d6a3

I'm running Postgres 9.4.6 on a Mac 10.10.5, installed via Homebrew, with
the read committed isolation level. The client is a Go application with 8
concurrent database connections, using prepared statements with the
github.com/lib/pq client.

I also observe that this only seems to occur when I am simultaneously
inserting rows into the table. The inserts occur from a different Go
application, running on a separate process with a separate connection pool.

Any ideas? Maybe I don't understand SQL properly? I can reliably reproduce
this, please ping me if you'd like more information!

--
kevin

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kevin Burke 2016-04-22 23:01:11 Re: UPDATE grabs multiple rows when it seems like it should only grab one
Previous Message Tom Lane 2016-04-22 21:44:01 Re: problem installing postgres in debian8 from debian repository