Re: UPDATE grabs multiple rows when it seems like it should only grab one

From: Kevin Burke <burke(at)shyp(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: UPDATE grabs multiple rows when it seems like it should only grab one
Date: 2016-04-22 23:01:11
Message-ID: CAEYV4paOUT_YbrHv6SiQuNqPUpOmh9oCqyWCgqwp8NvXGvRfrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

(I should also note - the `id` column is a uuid primary key, I double
checked the table afterwards, there's no chance that two ID's were inserted
with the same value.)

On Fri, Apr 22, 2016 at 3:56 PM, Kevin Burke <burke(at)shyp(dot)com> wrote:

> Hi,
> I'm trying to write a job queue that grabs one job at a time from the
> queue. I expect that the following query should update a maximum of one row
> in the table:
>
> UPDATE queued_jobs
> SET status='in-progress',
> updated_at=now()
> FROM (
> SELECT id AS inner_id
> FROM queued_jobs
> WHERE status='queued'
> AND name = $1
> AND run_after <= now()
> LIMIT 1
> FOR UPDATE
> ) find_job
> WHERE queued_jobs.id = find_job.inner_id
> AND status='queued'
> RETURNING id,
> name,
> attempts,
> run_after,
> expires_at,
> status,
> data,
> created_at,
> updated_at
>
> However, I observe that multiple rows are updated. I am certain that it's
> a single query updating multiple rows, because I observed this in the
> EXPLAIN output, and also configured my application to crash if multiple
> rows were returned, and could reliably trigger an application crash.
>
> Here is the EXPLAIN output from a query when two rows were returned:
>
> Update on queued_jobs (cost=0.75..16.83 rows=1 width=120) (actual
> time=3.011..67.515 rows=2 loops=1)
> -> Nested Loop (cost=0.75..16.83 rows=1 width=120) (actual
> time=2.974..67.458 rows=2 loops=1)
> Join Filter: (queued_jobs.id = find_job.inner_id)
> Rows Removed by Join Filter: 475
> -> Index Scan using queued_jobs_pkey on queued_jobs
> (cost=0.38..8.39 rows=1 width=80) (actual time=0.011..1.326 rows=477
> loops=1)
> Filter: (status = 'queued'::job_status)
> Rows Removed by Filter: 1
> -> Subquery Scan on find_job (cost=0.38..8.42 rows=1 width=56)
> (actual time=0.137..0.138 rows=1 loops=477)
> -> Limit (cost=0.38..8.41 rows=1 width=22) (actual
> time=0.136..0.136 rows=1 loops=477)
> -> LockRows (cost=0.38..8.41 rows=1 width=22)
> (actual time=0.136..0.136 rows=1 loops=477)
> -> Index Scan using find_queued_job on
> queued_jobs queued_jobs_1 (cost=0.38..8.40 rows=1 width=22) (actual
> time=0.134..0.135 rows=2 loops=477)
> Index Cond: ((name = $1) AND (run_after <=
> now()))
> Filter: (status = 'queued'::job_status)
>
> Here's the EXPLAIN output from a "normal" query that only gets one row:
>
> Update on queued_jobs (cost=0.41..8.53 rows=1 width=120) (actual
> time=3.730..3.733 rows=1 loops=1)
> -> Nested Loop (cost=0.41..8.53 rows=1 width=120) (actual
> time=3.688..3.690 rows=1 loops=1)
> -> Subquery Scan on find_job (cost=0.00..0.08 rows=1 width=56)
> (actual time=3.672..3.673 rows=1 loops=1)
> -> Limit (cost=0.00..0.07 rows=1 width=22) (actual
> time=3.662..3.662 rows=1 loops=1)
> -> LockRows (cost=0.00..2935.47 rows=42743 width=22)
> (actual time=3.661..3.661 rows=1 loops=1)
> -> Seq Scan on queued_jobs queued_jobs_1
> (cost=0.00..2508.04 rows=42743 width=22) (actual time=1.362..1.375 rows=5
> loops=1)
> Filter: ((status = 'queued'::job_status)
> AND (name = $1) AND (run_after <= now()))
> Rows Removed by Filter: 1
> -> Index Scan using queued_jobs_pkey on queued_jobs
> (cost=0.41..8.44 rows=1 width=80) (actual time=0.012..0.013 rows=1 loops=1)
> Index Cond: (id = find_job.inner_id)
> Filter: (status = 'queued'::job_status)
>
> For convenience, I've posted these (and a table schema) here:
> https://gist.github.com/kevinburkeshyp/ba5fdac337b3793628261de5fb26d6a3
>
> I'm running Postgres 9.4.6 on a Mac 10.10.5, installed via Homebrew, with
> the read committed isolation level. The client is a Go application with 8
> concurrent database connections, using prepared statements with the
> github.com/lib/pq client.
>
> I also observe that this only seems to occur when I am simultaneously
> inserting rows into the table. The inserts occur from a different Go
> application, running on a separate process with a separate connection pool.
>
> Any ideas? Maybe I don't understand SQL properly? I can reliably reproduce
> this, please ping me if you'd like more information!
>
> --
> kevin
>

--
kevin

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message David G. Johnston 2016-04-22 23:06:00 Re: UPDATE grabs multiple rows when it seems like it should only grab one
Previous Message Kevin Burke 2016-04-22 22:56:17 UPDATE grabs multiple rows when it seems like it should only grab one