From: | Glen Mailer <glen(at)geckoboard(dot)com> |
---|---|
To: | Zhang Mingli <zmlpostgres(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Possible bug with SKIP LOCKED behaviour |
Date: | 2022-09-29 08:50:50 |
Message-ID: | CAHvdy4VGD+Jk2hc=m=iYL4amAohcUBoPzYcChM1mfHtPQ9AbWw@mail.gmail.com |
Lists: | pgsql-bugs |
Hello
> With SKIP LOCKED, any selected rows that cannot be immediately locked are
> skipped. Skipping locked rows provides an inconsistent view of the data, so
> this is not suitable for general purpose work, but can be used to avoid
> lock contention with multiple consumers accessing a queue-like table.
>
Yes, I am specifically aiming to avoid lock contention with multiple
consumers accessing a queue-like table, and I'm seeing the same row being
retrieved by multiple workers.
> And a golang script is not convenient for hackers to reproduce. Could you
> provide some steps to produce the bug stably if it really was ?
>
Reproducing requires running a transaction with queries dependent on the
results of earlier queries, and then running a number of these transactions
concurrently, and then repeating the test until the unexpected result
happens. Currently I'm doing 20 concurrent transactions, and I find that if
I repeat the test 100 times I tend to get between zero and 3 failures.
What would be a more convenient way for me to provide this for reproduction?
Thanks
Glen
On Thu, 29 Sept 2022 at 03:41, Zhang Mingli <zmlpostgres(at)gmail(dot)com> wrote:
> Hi,
>
> On Sep 29, 2022, 00:56 +0800, Glen Mailer <glen(at)geckoboard(dot)com>, wrote:
>
> Hello everyone
>
> I believe I've run into a bug in the behaviour of SKIP LOCKED, where I
> have a program that implements a queue with concurrent workers SELECTing
> work from some shared tables.
>
> The code in question does a LEFT JOIN across two tables with a FOR UPDATE
> on the left table and a SKIP LOCKED clause, and then UPDATEs or INSERTs
> rows into the table on the right side of the JOIN in a way that causes
> subsequent executions of the same query to no longer match those rows.
> However, when run concurrently I'm seeing the same row be selected by
> multiple workers - which shouldn't be possible based on my understanding of
> the relevant semantics of these operations. Perhaps I'm just holding it
> wrong, but I would have expected the FOR UPDATE lock on the left table to
> be sufficient to avoid overlapping results.
>
> I have extracted a fairly minimal reproducing case from our production
> code, which includes some Go code as a test harness to run the queries
> concurrently enough to demonstrate the problem - this can be found at
> https://github.com/glenjamin/postgres-skip-locked-surprise
> I wasn't sure how much detail from that reproducing case to repeat in this
> email, so I've only gone with an outline of the observed and expected
> behaviour - but I can try and add more detail to this thread if desired
>
> Cheers
> Glen
>
> According to doc:
>
> With SKIP LOCKED, any selected rows that cannot be immediately locked are
> skipped. Skipping locked rows provides an inconsistent view of the data, so
> this is not suitable for general purpose work, but can be used to avoid
> lock contention with multiple consumers accessing a queue-like table.
>
> this can be found at
> https://github.com/glenjamin/postgres-skip-locked-surprise
>
> And a golang script is not convenient for hackers to reproduce. Could you
> provide some steps to produce the bug stably if it really was ?
>
> Regards,
> Zhang Mingli
>
>