Re: SQL:2011 application time

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>
Cc: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL:2011 application time
Date: 2024-07-25 05:57:00
Message-ID: CACJufxHyj4HDjtAjz5fNjoKkadiuxGLA6KQUtSf_XpHsx9JRwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 24, 2024 at 12:08 AM Paul Jungwirth
<pj(at)illuminatedcomputing(dot)com> wrote:
>
> On 7/18/24 11:39, Paul Jungwirth wrote:
> > So I swapped in the &&& patch, cleaned it up, and added tests. But something is wrong. After I get
> > one failure from an empty, I keep getting failures, even though the table is empty:
> >
> > regression=# truncate temporal_rng cascade;
> > NOTICE: truncate cascades to table "temporal_fk_rng2rng"
> > TRUNCATE TABLE
> > regression=# insert into temporal_rng values ('[1,2)', '[2000-01-01,2010-01-01)'); -- ok so far
> > INSERT 0 1
> > regression=# insert into temporal_rng values ('[1,2)', 'empty'); -- should fail and does
> > ERROR: range cannot be empty
> > regression=# insert into temporal_rng values ('[1,2)', '[2010-01-01,2020-01-01)'); -- uh oh
> > ERROR: range cannot be empty
> > regression=# truncate temporal_rng cascade;
> > NOTICE: truncate cascades to table "temporal_fk_rng2rng"
> > TRUNCATE TABLE
> > regression=# insert into temporal_rng values ('[1,2)', '[2000-01-01,2010-01-01)'); -- ok so far
> > INSERT 0 1
> > regression=# insert into temporal_rng values ('[1,2)', '[2010-01-01,2020-01-01)'); -- ok now
> > INSERT 0 1
> >
> > It looks like the index is getting corrupted. Continuing from the above:
> >
> > regression=# create extension pageinspect;
> > CREATE EXTENSION
> > regression=# select gist_page_items(get_raw_page('temporal_rng_pk', 0), 'temporal_rng_pk');
> > gist_page_items
> > ----------------------------------------------------------------------------
> > (1,"(0,1)",40,f,"(id, valid_at)=(""[1,2)"", ""[2000-01-01,2010-01-01)"")")
> > (2,"(0,2)",40,f,"(id, valid_at)=(""[1,2)"", ""[2010-01-01,2020-01-01)"")")
> > (2 rows)
> >
> > regression=# insert into temporal_rng values ('[1,2)', 'empty');
> > ERROR: range cannot be empty
> > regression=# select gist_page_items(get_raw_page('temporal_rng_pk', 0), 'temporal_rng_pk');
> > gist_page_items
> > ----------------------------------------------------------------------------
> > (1,"(0,1)",40,f,"(id, valid_at)=(""[1,2)"", ""[2000-01-01,2010-01-01)"")")
> > (2,"(0,2)",40,f,"(id, valid_at)=(""[1,2)"", ""[2010-01-01,2020-01-01)"")")
> > (3,"(0,3)",32,f,"(id, valid_at)=(""[1,2)"", empty)")
> > (3 rows)
>
> I realized this isn't index corruption, just MVCC. The exclusion constraint is checked after we
> update the index, which is why the row gets left behind. But it doesn't cause any wrong answers, and
> if you vacuum the table the row goes away.
>
> This also explains my confusion here:
>
> > I thought of a possible problem: this operator works great if there are already rows in the table,
> > but what if the *first row you insert* has an empty range? Then there is nothing to compare against,
> > so the operator will never be used. Right?
> >
> > Except when I test it, it still works!
>
> The first row still does a comparison because when we check the exclusion constraint, there is a
> comparison between the query and the key we just inserted. (When I say "query" I don't mean a SQL
> query, but the value used to search the index that is compared against its keys.)
>
> So I'm glad I didn't stumble on a GiST bug, but I think it means ereporting from an exclusion operator
> is not a workable approach. Failures leave behind invalid tuples, and future (valid) tuples can fail if
> we compare to those invalid tuples. Since MVCC visibility is stored in the heap, not in the index, it's
> not really accessible to us here. So far I don't have any ideas to rescue this idea, even though I like
> it a lot. So I will go back to the executor idea we discussed at pgconf.dev.
>

another kind of crazy idea.
instead of "ERROR: range cannot be empty"
let it return true.
so 'empty'::int4range &&& 'empty'; return true.

one downside is, if your first row period column is empty, then you
can not insert any new rows
that have the same non-period key column.

for example:
drop table if exists temporal_rng1 ;
CREATE TABLE temporal_rng1 (
id int4range,
valid_at int4range,
CONSTRAINT temporal_rng1_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);
insert into temporal_rng1 values ('[1,2]', 'empty');

In this context, now, you cannot insert any new rows whose id is equal
to '[1,2]'.

----but if your first row is not empty, then you won't have empty.
truncate temporal_rng1;
insert into temporal_rng1 values ('[1,2]', '[3,4]');

then
insert into temporal_rng1 values ('[1,2]', 'empty'); --will fail.

In summary, you will have exactly one empty, no other values (if the
first row is empty).
or you will have values and not empty values at all.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2024-07-25 06:34:08 RE: Conflict detection and logging in logical replication
Previous Message Konstantin Berkaev 2024-07-25 05:42:34 Re: Support logical replication of DDLs