Re: SQL:2011 application time

From: Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL:2011 application time
Date: 2024-01-06 00:19:53
Message-ID: 609ede69-49fb-4c45-acc8-544b72d6d5bb@illuminatedcomputing.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Getting caught up on reviews from November and December:

On 11/19/23 22:57, jian he wrote:
>
> I believe the following part should fail. Similar tests on
> src/test/regress/sql/generated.sql. line begin 347.
>
> drop table if exists gtest23a,gtest23x cascade;
> CREATE TABLE gtest23a (x int4range, y int4range,
> CONSTRAINT gtest23a_pk PRIMARY KEY (x, y WITHOUT OVERLAPS));
> CREATE TABLE gtest23x (a int4range, b int4range GENERATED ALWAYS AS
> ('empty') STORED,
> FOREIGN KEY (a, PERIOD b ) REFERENCES gtest23a(x, PERIOD y) ON UPDATE
> CASCADE); -- should be error?

Okay, I've added a restriction for temporal FKs too. But note this will
change once the PERIODs patch (the last one here) is finished. When the
generated column is for a PERIOD, there will be logic to "reroute" the
updates to the constituent start/end columns instead.

> begin;
> drop table if exists fk, pk cascade;
> CREATE TABLE pk (id int4range, valid_at int4range,
> CONSTRAINT pk_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
> );
> CREATE TABLE fk (
> id int4range,valid_at tsrange, parent_id int4range,
> CONSTRAINT fk FOREIGN KEY (parent_id, valid_at)
> REFERENCES pk
> );
> rollback;
> --
> the above query will return an error: number of referencing and
> referenced columns for foreign key disagree.
> but if you look at it closely, primary key and foreign key columns both are two!
> The error should be saying valid_at should be specified with "PERIOD".

Ah okay, thanks for the clarification! This is tricky because the user
left out the PERIOD on the fk side, and left out the entire pk side, so
those columns are just implicit. So there is no PERIOD anywhere.
But I agree that if the pk has WITHOUT OVERLAPS, we should expect a
corresponding PERIOD modifier on the fk side and explain that that's
what's missing. The attached patches include that.

> I found out other issues in v18.
> I first do `git apply` then `git diff --check`, there is a white
> space error in v18-0005.

Fixed, thanks!

> You also need to change update.sgml and delete.sgml <title>Outputs</title> part.
> Since at most, it can return 'UPDATE 3' or 'DELETE 3'.

This doesn't sound correct to me. An UPDATE or DELETE can target many
rows. Also I don't think the inserted "leftovers" should be included in
these counts. They represent the rows updated/deleted.

> --the following query should work?
> drop table pk;
> CREATE table pk(a numrange PRIMARY key,b text);
> insert into pk values('[1,10]');
> create or replace function demo1() returns void as $$
> declare lb numeric default 1; up numeric default 3;
> begin
> update pk for portion of a from lb to up set b = 'lb_to_up';
> return;
> end
> $$ language plpgsql;
> select * from demo1();

Hmm this is a tough one. It is correct that the `FROM __ TO __` values cannot be column references.
They are computed up front, not per row. One reason is they are used to search the table. In fact
the standard basically allows nothing but literal strings here. See section 14.14, page 971 then
look up <point in time> on page 348 and <datetime value expression> on page 308. The most
flexibility you get is you can add/subtract an interval to the datetime literal. We are already well
past that by allowing expressions, (certain) functions, parameters, etc.

OTOH in your plpgsql example they are not really columns. They just get represented as ColumnRefs
and then passed to transformColumnRef. I'm surprised plpgsql does it that way. As a workaround you
could use `EXECUTE format(...)`, but I'd love to make that work as you show instead. I'll keep
working on this one but it's not done yet. Perhaps I can move the restriction into
analysis/planning. If anyone has any advice it'd be welcome.

On 12/6/23 05:22, jian he wrote:
> this TODO:
> * TODO: It sounds like FOR PORTION OF might need to do something here too?
> based on comments on ExprContext. I refactor a bit, and solved this TODO.

The patch looks wrong to me. We need to range targeted by `FROM __
TO __` to live for the whole statement, not just one tuple (see just
above). That's why it gets computed in the Init function node.

I don't think that TODO is needed anymore at all. Older versions of the
patch had more expressions besides this one, and I think it was those I
was concerned about. So I've removed the comment here.

> tring to the following TODO:
> // TODO: Need to save context->mtstate->mt_transition_capture? (See
> comment on ExecInsert)
>
> but failed.
> I also attached the trial, and also added the related test.
>
> You can also use the test to check portion update with insert trigger
> with "referencing old table as old_table new table as new_table"
> situation.

Thank you for the test case! This is very helpful. So the problem is
`referencing new table as new_table` gets lost. I don't have a fix yet
but I'll work on it.

On 12/11/23 00:31, jian he wrote:
> - false); /* quiet */
> + false); /* quiet */
>
> Is the above part unnecessary?

Good catch! Fixed.

> I am confused. so now I only apply v19, 0001 to 0003.
> period_to_range function never used. maybe we can move this part to
> 0005-Add PERIODs.patch?
> Also you add change in Makefile in 0003, meson.build change in 0005,
> better put it on in 0005?

You're right, those changes should have been in the PERIODs patch. Moved.

> +/*
> + * We need to handle this shift/reduce conflict:
> + * FOR PORTION OF valid_at FROM INTERVAL YEAR TO MONTH TO foo.
> + * This is basically the classic "dangling else" problem, and we want a
> + * similar resolution: treat the TO as part of the INTERVAL, not as part of
> + * the FROM ... TO .... Users can add parentheses if that's a problem.
> + * TO just needs to be higher precedence than YEAR_P etc.
> + * TODO: I need to figure out a %prec solution before this gets committed!
> + */
> +%nonassoc YEAR_P MONTH_P DAY_P HOUR_P MINUTE_P
> +%nonassoc TO
>
> this part will never happen?
> since "FROM INTERVAL YEAR TO MONTH TO"
> means "valid_at" will be interval range data type, which does not exist now.

It appears still needed to me. Without those lines I get 4 shift/reduce
conflicts. Are you seeing something different? Or if you have a better
solution I'd love to add it. I definitely need to fix this before that
patch gets applied.

> for all the refactor related to ri_PerformCheck, do you need (Datum) 0
> instead of plain 0?

Casts added.

> + <para>
> + If the table has a range column or
> + <link linkend="ddl-periods-application-periods"><literal>PERIOD</literal></link>,
> + you may supply a <literal>FOR PORTION OF</literal> clause, and
> your delete will
> + only affect rows that overlap the given interval. Furthermore, if
> a row's span
>
>
https://influentialpoints.com/Training/basic_statistics_ranges.htm#:~:text=A%20range%20is%20two%20numbers,or%20the%20difference%20between%20them
> So "range" is more accurate than "interval"?

I don't think we should be using R to define the terms "range" and
"interval", which both already have meanings in Postgres, SQL, and the
literature for temporal databases. But I'm planning to revise the docs'
terminology here anyway. Some temporal database texts use "interval"
in this sense, and I thought it was a decent term to mean "range or
PERIOD". But now we need something to mean "range or multirange or
custom type or PERIOD". Actually "portion" seems like maybe the best
term, since the SQL syntax `FOR PORTION OF` reinforces that term. If you
have suggestions I'm happy for ideas.

> +/* ----------
> + * ForPortionOfState()
> + *
> + * Copies a ForPortionOfState into the current memory context.
> + */
> +static ForPortionOfState *
> +CopyForPortionOfState(ForPortionOfState *src)
> +{
> + ForPortionOfState *dst = NULL;
> + if (src) {
> + MemoryContext oldctx;
> + RangeType *r;
> + TypeCacheEntry *typcache;
> +
> + /*
> + * Need to lift the FOR PORTION OF details into a higher memory context
> + * because cascading foreign key update/deletes can cause triggers to fire
> + * triggers, and the AfterTriggerEvents will outlive the FPO
> + * details of the original query.
> + */
> + oldctx = MemoryContextSwitchTo(TopTransactionContext);
>
> should it be "Copy a ForPortionOfState into the TopTransactionContext"?

You're right, the other function comments here use imperative mood. Changed.

New patches attached, rebased to 43b46aae12. I'll work on your feedback from Jan 4 next. Thanks!

--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com

Attachment Content-Type Size
v22-0001-Add-stratnum-GiST-support-function.patch text/x-patch 18.8 KB
v22-0002-Add-temporal-PRIMARY-KEY-and-UNIQUE-constraints.patch text/x-patch 77.3 KB
v22-0003-Add-GiST-referencedagg-support-func.patch text/x-patch 9.9 KB
v22-0004-Add-temporal-FOREIGN-KEYs.patch text/x-patch 147.6 KB
v22-0005-Add-multi-range_without_portion-proc-operator.patch text/x-patch 17.1 KB
v22-0006-Add-UPDATE-DELETE-FOR-PORTION-OF.patch text/x-patch 153.3 KB
v22-0007-Add-CASCADE-SET-NULL-SET-DEFAULT-for-temporal-fo.patch text/x-patch 112.0 KB
v22-0008-Add-PERIODs.patch text/x-patch 131.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-01-06 00:44:00 Re: remaining sql/json patches
Previous Message Robert Haas 2024-01-06 00:03:32 Re: Fix bogus Asserts in calc_non_nestloop_required_outer