Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
Date: 2021-12-20 10:45:42
Message-ID: 20211220194542.7c5eb379d4baf0837ab8c88f@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Greg,

On Fri, 17 Dec 2021 01:40:45 -0500
Greg Stark <stark(at)mit(dot)edu> wrote:

> On Thu, 16 Dec 2021 at 22:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > * If the sort order is underspecified, or you omit ORDER BY
> > entirely, then it's not clear which rows will be operated on.
> > The LIMIT might stop after just some of the rows in a peer
> > group, and you can't predict which ones.
>
> Meh, that never seemed very compelling to me. I think that's on the
> user and there are *lots* of cases where the user can easily know
> enough extra context to know that's what she wants. In particular I've
> often wanted to delete one of two identical records and it would have
> been really easy to just do a DELETE LIMIT 1. I know there are ways to
> do it but it's always seemed like unnecessary hassle when there was a
> natural way to express it.

Out of curiosity, could you please tell me the concrete situations
where you wanted to delete one of two identical records?

> > * UPDATE/DELETE necessarily involve the equivalent of SELECT
> > FOR UPDATE, which may cause the rows to be ordered more
> > surprisingly than you expected, ie the sort happens *before*
> > rows are replaced by their latest versions, which might have
> > different sort keys.
>
> This... is a real issue. Or is it? Just to be clear I think what
> you're referring to is a case like:
>
> INSERT INTO t values (1),(2)
>
> In another session: BEGIN; UPDATE t set c=0 where c=2
>
> DELETE FROM t ORDER BY c ASC LIMIT 1
> <blocks>
>
> In other session: COMMIT
>
> Which row was deleted? In this case it was the row where c=1. Even
> though the UPDATE reported success (i.e. 1 row updated) so presumably
> it happened "before" the delete. The delete saw the ordering from
> before it was blocked and saw the ordering with c=1, c=2 so apparently
> it happened "before" the update.

When I tried it using my patch, the DELETE deletes the row where c=1
as same as above, but it did not block. Is that the result of an
experiment using my patch or other RDBMS like MySQL?

> There are plenty of other ways to see the same surprising
> serialization failure. If instead of a DELETE you did an UPDATE there
> are any number of functions or other features that have been added
> which can expose the order in which the updates happened.
>
> The way to solve those cases would be to use serializable isolation
> (or even just repeatable read in this case). Wouldn't that also solve
> the DELETE serialization failure too?

Do you mean such serialization failures would be avoided in
SERIALIZABLE or REPATABLE READ by aborting the transaction, such
serialization failures may be accepted in READ COMMITTED?

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-12-20 11:11:04 Re: Confused comment about drop replica identity index
Previous Message Amit Kapila 2021-12-20 10:29:59 Re: row filtering for logical replication