Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET

From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
Date: 2021-12-17 06:40:45
Message-ID: CAM-w4HOT2CHWdRk6uPDG-srs6KygUzkfLWmi=rJ0Sc7Hr32myg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 16 Dec 2021 at 22:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> * If the sort order is underspecified, or you omit ORDER BY
> entirely, then it's not clear which rows will be operated on.
> The LIMIT might stop after just some of the rows in a peer
> group, and you can't predict which ones.

Meh, that never seemed very compelling to me. I think that's on the
user and there are *lots* of cases where the user can easily know
enough extra context to know that's what she wants. In particular I've
often wanted to delete one of two identical records and it would have
been really easy to just do a DELETE LIMIT 1. I know there are ways to
do it but it's always seemed like unnecessary hassle when there was a
natural way to express it.

> * UPDATE/DELETE necessarily involve the equivalent of SELECT
> FOR UPDATE, which may cause the rows to be ordered more
> surprisingly than you expected, ie the sort happens *before*
> rows are replaced by their latest versions, which might have
> different sort keys.

This... is a real issue. Or is it? Just to be clear I think what
you're referring to is a case like:

INSERT INTO t values (1),(2)

In another session: BEGIN; UPDATE t set c=0 where c=2

DELETE FROM t ORDER BY c ASC LIMIT 1
<blocks>

In other session: COMMIT

Which row was deleted? In this case it was the row where c=1. Even
though the UPDATE reported success (i.e. 1 row updated) so presumably
it happened "before" the delete. The delete saw the ordering from
before it was blocked and saw the ordering with c=1, c=2 so apparently
it happened "before" the update.

There are plenty of other ways to see the same surprising
serialization failure. If instead of a DELETE you did an UPDATE there
are any number of functions or other features that have been added
which can expose the order in which the updates happened.

The way to solve those cases would be to use serializable isolation
(or even just repeatable read in this case). Wouldn't that also solve
the DELETE serialization failure too?

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2021-12-17 06:42:58 Re: Emit a warning if the extension's GUC is set incorrectly
Previous Message Kyotaro Horiguchi 2021-12-17 06:40:10 Re: [PATCH] Accept IP addresses in server certificate SANs