From: | Paul Jungwirth <pj(at)illuminatedcomputing(dot)com> |
---|---|
To: | Sam Gabrielsson <sam(at)movsom(dot)se> |
Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: SQL:2011 application time |
Date: | 2024-11-15 06:38:47 |
Message-ID: | 04579cbf-b134-45e1-8f2d-8c54c849c1ee@illuminatedcomputing.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 11/14/24 09:31, Paul Jungwirth wrote:
>> Thank you for the report! I confirmed that this is a problem. In ri_restrict we fail if any fk
>> records still match the being-changed pk, but for temporal if you're merely shrinking the pk
>> range, fk references could still wind up being valid (if you're only shrinking it a little). So we
>> need to do more work.
>
> I'm nearly done with a patch for this. I'll try to wrap it up today and get it sent this evening.
Here are patches fixing the foreign key problems, as well as the outstanding logical replication fix
(with more explanation in the commit message). There is also a commit to add the without_portion
support function (originally intended for FOR PORTION OF, but useful here too).
For NOACTION, we might as well skip ri_Check_Pk_Match, because we need to look up the details of the
referenced/referencing time periods, and we can do that in the main SQL query below.
On 11/4/24 13:16, Sam Gabrielsson wrote:
> SELECT 1
> FROM (SELECT range_agg(pkperiodatt) AS r
> FROM <pktable>
> WHERE pkatt1 = $1 [AND ...]
> AND pkperiodatt && $n) AS pktable,
> (SELECT fkperiodatt AS r
> FROM <fktable>
> WHERE fkatt1 = $1 [AND ...]
> AND fkperiodatt && $n) AS fktable
> WHERE NOT fktable.r <@ pktable.r
This query doesn't quite work, because the FK record can span multiple PK records, so finding only
PK records that overlap the changed range may miss them. That means we could still fail erroneously.
One fix is to consider *all* PK ranges with the same scalar key part, but that will get expensive.
A better fix is to consider only FK ranges after truncating them to fit within the updated PK span.
We can use the intersect operator for that. Since temporal foreign keys only support ranges &
multiranges, we can hardcode that operator lookup. (I still hope to support arbitrary types in the
future, and asking for an intersect operator isn't hard.)
Here is some SQL using that approach to find invalid references. Variables like $1 and $n are from
oldslot, and $n is the range value (like the normal FK check).
SELECT 1
FROM [ONLY] <fktable> x
WHERE $1 = x.fkatt1 [AND ...] AND $n && x.fkperiod
AND NOT coalesce((x.fkperiod * $n) <@
(SELECT range_agg(r)
FROM (SELECT y.pkperiod r
FROM [ONLY] <pktable> y
WHERE $1 = y.pkatt1 [AND ...] AND $n && y.pkperiod
FOR KEY SHARE OF y) y2), false)
FOR KEY SHARE OF x
We need the coalesce because the range_agg subquery could return no rows, and `NOT x <@ NULL` is
null. We need another subquery because FOR KEY SHARE isn't permitted in aggregate queries.
On 11/14/24 09:31, Paul Jungwirth wrote:
> The RESTRICT case needs to find the *lost* time span(s) (because it might not be the whole thing)
> and check for references to those. To do that, it calls our without_portion support function. That
> function was intended to support FOR PORTION OF, but it happens to be exactly what we need here. So
> I'm reordering the patches a bit and adjusting the documentation there.
To elaborate:
Here is what SQL:2011 says for ON UPDATE RESTRICT (4.18.3.3):
> ON UPDATE RESTRICT: any change to a referenced column in the referenced table is prohibited if
there is a matching row.
I think this requires some interpretation for temporal foreign keys. It is not talking about the
PERIOD part, but about the scalar part(s). (Recall that in the standard the PERIOD is not even a
column.) It helps to apply the principle that a temporal table behaves the same as a table with one
row per second (or millisecond or whatever). We should fail if the key changes for a referenced moment.
But we don't get a chance to find replacements in the PK table. Instead of asking whether FKs'
requirements are still fulfilled, we need to ask what was lost and then fail if we find references
to that.
There are several cases to consider: Did you change the start/end times? Did you change the scalar
key part? Did you use FOR PORTION OF?
Assume you didn't use FOR PORTION OF. Let $old and $new indicate the old and new valid-time values.
If you changed the scalar key part, then all of $old is treated as lost. $new doesn't matter. If you
didn't change it, then only `$old - $new` is treated as lost. (Note that `$old - $new` could be
empty---say you expanded the span---meaning nothing was lost here.) If there are any references
overlapping the lost part(s), we fail.
With FOR PORTION OF things are a little different. IMO even a RESTRICT key should not fail if it
references "leftovers" that weren't targeted by FOR PORTION OF. It's true that we shrink the
referenced row, and the reference now depends on the newly-inserted replacements. But conceptually
the replacements are representing the timeline that *didn't change*. Also if you take the opposite
position, then FOR PORTION OF is simply unusable with RESTRICT keys.
With that understanding, if you change the scalar key part, the lost part is $new, not $old. And if
you don't change the scalar key part, nothing is lost at all.
I could be wrong though. I didn't find anything in the standard addressing this specifically, and
I'm only working from a draft of SQL:2011. If someone with a more current copy has specific
guidance, I'd be happy to see that.
In any case, FOR PORTION OF isn't merged yet. For now I've only attached patches to fix the
outstanding problems. After rebasing tonight I ran into some tricky conflicts with my FOR PORTION OF
and PERIODs patches, so I will re-submit those later. Adapting the RESTRICT code for FOR PORTION OF,
using the plan above, is pretty simple. We just set $n to $new/empty instead of $old.
Rebased to 818119afcc.
Yours,
--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com
Attachment | Content-Type | Size |
---|---|---|
v44-0001-Fix-logical-replication-for-temporal-tables.patch | text/x-patch | 29.8 KB |
v44-0002-Add-without_portion-GiST-support-proc.patch | text/x-patch | 41.0 KB |
v44-0003-Fix-NOACTION-temporal-foreign-keys-when-the-refe.patch | text/x-patch | 23.6 KB |
v44-0004-Fix-RESTRICT-temporal-foreign-keys-when-the-refe.patch | text/x-patch | 30.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Bertrand Drouvot | 2024-11-15 06:46:52 | Re: define pg_structiszero(addr, s, r) |
Previous Message | Amit Kapila | 2024-11-15 06:20:40 | Re: Improve the error message for logical replication of regular column to generated column. |