Re: BUG #17261: FK ON UPDATE CASCADE can break referential integrity with columns of different types

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, marcus(at)cockroachlabs(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17261: FK ON UPDATE CASCADE can break referential integrity with columns of different types
Date: 2021-11-02 01:57:08
Message-ID: CAKFQuwbaU9pcb+uEmxpu=-FM2MaT8j2HYt51r9NQfdhHkL0WKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Nov 1, 2021 at 5:34 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> The problem with this numeric case is that
> we round off the value while storing it into the referencing column
> --- but, for the specific values given, that results in no change in
> the referencing value so ri_KeysEqual() decides that there's no need
> to re-check the constraint. That's not an optimization I care to
> give up.

AFAICT, the only place where our documentation touches on this is
> 5.4.5. Foreign Keys, which breezily says "Of course, the number and
> type of the constrained columns need to match the number and type of
> the referenced columns." So maybe we need to improve that, but I'm
> not sure what to say instead. In view of these considerations,
> we surely shouldn't encourage using different types.
>
>
The following comes mostly from reading this thread. This is mostly just a
conversation starter and me trying to make sure I understand the problem
accurately, regardless of it's acceptability as documentation.

"""
Warning, be careful, or simply avoid, using on update cascade when the data
types involved in the foreign key are not identical.

Why?

When using update cascade there is an optimization in place that compares
the old and new values for equality, after casting them to the referencing
table's data type, and skips performing the update if the value did not
change. This is problematic when multiple values in the referenced table's
data type map to a single value in the referencing table's data type.
e.g., both 1.00 and 1.45 numeric(10,2) are equal to 1 numeric(10,0). Thus
the optimization check will conclude that changing the referenced value
from 1.00 to 1.45 is a no-op with respect to the referencing row's value of
1, leaving the system in an inconsistent state.
"""

The fundamental issue here seems to be that normal FK usage results in
referencing-to-referenced lookups which are one-to-one. But the on update
usage results in a referenced-to-referencing lookup which ends up being
many-to-one (values, not rows). If that lookup ends up being one-to-one,
even if the data types are different, then that difference in data types
won't matter. In this example it is many-to-one.

I feel like the optimization being done, and the comparison between old and
new being performed, is amenable to change to detect this case and fail the
update command. Though I need to get more into the weeds to actually
defend that feeling and even offer an approach. But hopefully this helps
in the meantime.

Note: I haven't tried to reason out if on delete cascade has an issue here.

David J.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message egashira.yusuke@fujitsu.com 2021-11-02 02:56:57 RE: BUG #17254: Crash with 0xC0000409 in pg_stat_statements when pg_stat_tmp\pgss_query_texts.stat exceeded 2GB.
Previous Message Sandeep Thakkar 2021-11-02 00:50:42 Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data