using index to speedup add not null constraints to a table

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: using index to speedup add not null constraints to a table
Date: 2024-12-16 07:07:51
Message-ID: CACJufxFiW=4k1is=F1J=r-Cx1RuByXQPUrWB331U47rSnGz+hw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

hi.
context: i was trying to speedup add check constraint then
Sergei Kornilov sent me a link: [1].
so i studied that thread, then tried to implement $SUBJECT.
obviously not using SPI at all.

the logic is mentioned [2]:
"""
we could try an index scan - via index_beginscan() /
index_getnext() / index_endscan() - trying to pull exactly one null
from the index, and judge whether or not there are nulls present based
only whether we get one. This would be a lot cheaper than scanning a
large table, but it needs some careful thought because of visibility
issues.
"""

Currently, if the leading key column of an index is the same as the column with
the NOT NULL constraint, then $SUBJECT applies.
so the following test case, the $SUBJECT applies:
create index t_idx_ab on t(a,b);
alter table t add constraint t1 not null a;

However, query:
alter table t add constraint t1 not null b;
$SUBEJCT does not apply, since "b" is not the leading column of the index.
(It is possible that this could be implemented. So I missed something....)

This approach will not work for partitioned tables, as ALTER TABLE ALTER COLUMN
SET EXPRESSION may trigger an index rebuild. We cannot perform an index scan
if the index will be rebuilt later. In the future, if we determine that the
column being rebuilt in the index is the same as the column to which the NOT
NULL constraint is being added, then $SUBJECT can also be applied to partitioned
tables.

based on [3], I wrote some isolation tests to address concurrency issues.
however since add not-null constraint takes ACCESS EXCLUSIVE lock,
so there is less anomaly can happen?

PERFORMANCE:
--100% bloat and zero null bloat value:
drop table if exists t \; create unlogged table t(a int, b int, c int)
\; create index t_idx_a on t(a);
insert into t select g, g+1 from generate_series(1,1_000_000) g;
delete from t;

alter table t add constraint t1 not null a;
with patch Time: 1.873 ms
master Time: 648.312 ms

comments are welcome.

[1] https://www.postgresql.org/message-id/9878.1511970441%40sss.pgh.pa.us
[2] https://postgr.es/m/CA%2BTgmoa5NKz8iGW_9v7wz%3D-%2BzQFu%3DE4SZoaTaU1znLaEXRYp-Q%40mail.gmail.com
[3] https://postgr.es/m/900056D1-32DF-4927-8251-3E0C0DC407FD%40anarazel.de

Attachment Content-Type Size
v1-0001-using-index-to-speedup-add-not-null-constraint-to.patch text/x-patch 26.1 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-12-16 07:39:06 Re: pure parsers and reentrant scanners
Previous Message Andreas 'ads' Scherbaum 2024-12-16 07:00:00 Re: Crash: invalid DSA memory alloc request