Re: NOT ENFORCED constraint feature

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Joel Jacobson <joel(at)compiler(dot)org>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>
Subject: Re: NOT ENFORCED constraint feature
Date: 2025-02-13 05:34:10
Message-ID: CAExHW5uNhnBdgncoFFq2hjf3ykAMtebbRjGxr3SxTf+Nb0uO9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 12, 2025 at 8:15 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
> On 12.02.25 12:13, Álvaro Herrera wrote:
> > On 2025-Feb-12, Ashutosh Bapat wrote:
> >
> >> I have been asking a different question: What's the use of
> >> not-enforced constraints if we don't allow VALID, NOT ENFORCED state
> >> for them?
> >
> > That's a question for the SQL standards committee. They may serve
> > schema documentation purposes, for example.
> > https://www.postgresql.eu/events/pgconfeu2024/schedule/session/5677-exploring-postgres-databases-with-graphs/
> >
> >> OTOH, consider an application which "knows" that the constraint is
> >> valid for the data (either because of checks at application level, or
> >> because the data was replicated from some other system where the
> >> cosntraints were applied). It's a natural ask to use the constraints
> >> for, say optimization, but don't take unnecessary overhead of
> >> validating them. VALID, NOT ENFORCED state helps in such a scenario.
> >> Of course an application can misuse it (just like stable marking on a
> >> function), but well ... they will be penalised for their misuse.
> >
> > I disagree that we should see a VALID NOT ENFORCED constraint as one
> > that can be used for query optimization purposes. This is only going to
> > bring users pain, because it's far too easy to misuse and they will get
> > wrong query results, possibly without knowing for who knows how long.
>
> I've been digging into the ISO archives for some more background on the
> intended meaning of this feature.
>
> Result: "NOT ENFORCED" just means "off" or "disabled", "could contain
> anything". You can use this to do data loads, or schema surgery, or
> things like that. Or just if you want it for documentation.

Hmm, so one can convert an enforced constraint to a not-enforced
constraint, load the data or make changes and then enforce it again.
Makes sense.

>
> This idea that a not-enforced constraint should contain valid data
> anyway is not supported by anything I could find written down. I've
> heard that in discussions, but those could have been speculations.
>
> (I still think that could be a feature, but it's clearly not this one,
> at least not in its default state.)

Thanks for the background.

>
> So considering that, I think a three-state system makes more sense.
> Something like:
>
> 1) NOT ENFORCED -- no data is checked
> 2) NOT VALID -- existing data is unchecked, new data is checked
> 3) ENFORCED -- all data is checked
>
> Transitions:
>
> (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2)

Per your notation, this means the the constraint is not enforced but
new data is checked - that seems a contradiction, how would we check
the data when the constraint is not being enforced. Or do you suggest
that we convert a NOT ENFORCED constraint to ENFORCED as a result of
converting it to NOT VALID?

> (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ] -> (3)

Seems ok.

> (2) - [ ALTER TABLE ... VALIDATE CONSTRAINT ... ] -> (3)

As a result of this a not enforced constraint would turn into an
enforced constraint. The user might have intended to just validate the
data but not enforce it to avoid paying price for the checks on new
data.

> (2|3) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ] -> (1)

Looks fine.

> (3) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2)
>

This too seems ok assuming the constraint would remain enforced.

I think, what you intend to say is clearer with 4 state system {NE, E}
* {NV, V} = {(NE, NV), (NE, V), (E, NV), (E, V)} where (NE, V) is
unreachable. Let's name them S1, S2, S3, S4 respectively.

S1 -> [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> S1 - noop
S3 -> [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> S3 - noop
S4 -> [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> S3

S1->[ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ]->S4
S3->[ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ]->S3 - noop
S4->[ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ]->S4 - noop

S1->[ ALTER TABLE ... VALIDATE CONSTRAINT ... ]->S1 - but this is not
noop - the existing data gets validated but no change happens to the
state of the constraint - it is not enforced on the future data and
it's not considered valid. This gives opportunity to the user to just
validate the existing data but not enforce the constraint on new data
thus avoiding some computation on the new data. Of course we will have
to update the documentation to clearly specify the result. I think
VALIDATE CONSTRAINT is postgresql extension so we are free to
interpret it in the context of ENFORCED feature.
S3->[ ALTER TABLE ... VALIDATE CONSTRAINT ... ]->S4
S4->[ ALTER TABLE ... VALIDATE CONSTRAINT ... ]->S4 - noop

S1-[ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ]->S1 - noop
S3-[ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ]->S1
S4-[[ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ]->S1

Notice that there are no edges to and from S2.

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2025-02-13 05:48:28 Re: Address the bug in 041_checkpoint_at_promote.pl
Previous Message Peter Smith 2025-02-13 04:59:45 DOCS - Question about pg_sequences.last_value notes