From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Gabriele Bartolini <gabriele(dot)bartolini(at)2ndquadrant(dot)it>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it> |
Subject: | Re: [PATCH] Support for foreign keys with arrays |
Date: | 2011-11-17 04:50:27 |
Message-ID: | CAFj8pRCEhyC_PK3jJHE5ZS8Qo06+OBpbcjAkELW0skOgewgA5w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello
2011/11/17 Noah Misch <noah(at)leadboat(dot)com>:
> Hi Gabriele,
>
> On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote:
>> CREATE TABLE pt (
>> id INTEGER PRIMARY KEY,
>> ...
>> );
>>
>> CREATE TABLE ft (
>> id SERIAL PRIMARY KEY,
>> pids INTEGER[] REFERENCES pt,
>> ...
>> );
>
> This seems useful.
>
will be supported situation
CREATE TABLE main(
id int[] PRIMARY KEY,
...
);
CREATE TABLE child(
main_id int[] REFERENCES main(id),
??
Regards
Pavel Stehule
> I'm assuming the SQL spec says nothing about a feature like this?
>
>> This patch is for discussion and has been built against HEAD.
>> It compiles and passes all regressions tests (including specific ones -
>> see the src/test/regress/sql/foreign_key.sql file).
>> Empty arrays, multi-dimensional arrays, duplicate elements and NULL
>> values are allowed.
>
> With this patch, RI_Initial_Check does not detect a violation in an array that
> contains at least one conforming element:
>
> BEGIN;
> CREATE TABLE parent (c int PRIMARY KEY);
> CREATE TABLE child (c int[]);
> INSERT INTO parent VALUES (1);
> INSERT INTO child VALUES ('{3,1,2}');
> ALTER TABLE child ADD FOREIGN KEY (c) REFERENCES parent; -- should error
> INSERT INTO child VALUES ('{3,1,2}'); -- does error, as expected
> ROLLBACK;
>
> The error message DETAIL on constraint violation would benefit from
> array-FK-specific language. Example of current message:
>
> ERROR: insert or update on table "child" violates foreign key constraint "child_c_fkey"
> DETAIL: Key (c)=({3,1,2}) is not present in table "parent".
>
>
> The patch is missing a change to the code that does FK=FK checks when a user
> updates the FK side:
>
> \set VERBOSITY verbose
> BEGIN;
> CREATE TABLE parent (c int PRIMARY KEY);
> CREATE TABLE child (c int[] REFERENCES parent);
> INSERT INTO parent VALUES (1);
> INSERT INTO child VALUES ('{1,1}');
> COMMIT;
> -- ERROR: XX000: no conversion function from integer[] to integer
> -- LOCATION: ri_HashCompareOp, ri_triggers.c:4097
> UPDATE child SET c = '{1,1}';
> DROP TABLE parent, child;
> COMMIT;
>
> Please audit each ri_triggers.c entry point for further problems like this.
>
>> We had to enforce some limitations, due to the lack (yet) of a clear and
>> universally accepted behaviour and strategy.
>> For example, consider the ON DELETE action on the above tables: in case
>> of delete of a record in the 'pt' table, should we remove the whole row
>> or just the values from the array?
>> We hope we can start a discussion from here.
>
> Removing values from the array seems best to me. There's no doubt about what
> ON UPDATE CASCADE should do, and having ON DELETE CASCADE excise individual
> array elements is consistent with that. It's less clear for SET NULL, but I'd
> continue with a per-element treatment. I'd continue to forbid SET DEFAULT.
>
> However, Jeff Davis did expect ON DELETE CASCADE to remove entire rows:
> http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local
> So, perhaps the behavior needs to be user-selectable.
>
>> Current limitations:
>>
>> * Only arrays of the same type as the primary key in the referenced
>> table are supported
>
> This is understandable for a WIP, but the final patch will need to use our
> existing, looser foreign key type match requirement.
>
>> * multi-column foreign keys are not supported (only single column)
>
> Any particular reason for this?
>
>> *** a/doc/src/sgml/ddl.sgml
>> --- b/doc/src/sgml/ddl.sgml
>> ***************
>> *** 764,769 **** CREATE TABLE order_items (
>> --- 764,796 ----
>> the last table.
>> </para>
>>
>> + <para>
>> + Another option you have with foreign keys is to use a referencing column
>> + which is an array of elements with the same type as the referenced column
>> + in the related table. This feature, also known as <firstterm>foreign key arrays</firstterm>,
>> + is described in the following example:
>
> Please wrap your documentation paragraphs.
>
>> *** a/src/backend/commands/tablecmds.c
>> --- b/src/backend/commands/tablecmds.c
>> ***************
>> *** 5705,5710 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
>> --- 5705,5735 ----
>> Oid ffeqop;
>> int16 eqstrategy;
>>
>> + /* Check if foreign key is an array of primary key types */
>> + const bool is_foreign_key_array = (fktype == get_array_type (pktype));
>
> We don't declare non-pointer, local variables "const". Also, [not positive on
> this one] when an initial assignment requires a comment, declare the variable
> with no assignment and no comment. Then, assign it later with the comment.
> This keeps the per-block declarations packed together.
>
>
> This test wrongly rejects FK types that are domains over the array type:
>
> BEGIN;
> CREATE TABLE parent (c int PRIMARY KEY);
> CREATE DOMAIN intarrdom AS int[];
> CREATE TABLE child (c intarrdom REFERENCES parent);
> ROLLBACK;
>
>> +
>> + /* Enforce foreign key array restrictions */
>> + if (is_foreign_key_array)
>> + {
>> + /*
>> + * Foreign key array must not be part of a multi-column foreign key
>> + */
>> + if (is_foreign_key_array && numpks > 1)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_FOREIGN_KEY),
>> + errmsg("foreign key arrays must not be part of a multi-column foreign key")));
>> +
>> + /*
>> + * We have to restrict foreign key array to NO ACTION and RESTRICT mode
>> + * until the behaviour triggered by the other actions is clearer and well defined
>> + */
>> + if ((fkconstraint->fk_upd_action != FKCONSTR_ACTION_NOACTION && fkconstraint->fk_upd_action != FKCONSTR_ACTION_RESTRICT)
>> + || (fkconstraint->fk_del_action != FKCONSTR_ACTION_NOACTION && fkconstraint->fk_del_action != FKCONSTR_ACTION_RESTRICT))
>
> Break these lines to keep things within 78 columns. Audit the remainder of
> your changes for long lines, and break when in doubt.
>
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_FOREIGN_KEY),
>> + errmsg("NO ACTION and RESTRICT are the only supported actions for foreign key arrays")));
>
> Error message constants can remain unbroken, though.
>
>> + }
>> +
>> /* We need several fields out of the pg_opclass entry */
>> cla_ht = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclasses[i]));
>> if (!HeapTupleIsValid(cla_ht))
>> ***************
>> *** 5766,5772 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel,
>> Oid target_typeids[2];
>>
>> input_typeids[0] = pktype;
>> ! input_typeids[1] = fktype;
>> target_typeids[0] = opcintype;
>> target_typeids[1] = opcintype;
>> if (can_coerce_type(2, input_typeids, target_typeids,
>> --- 5791,5801 ----
>> Oid target_typeids[2];
>>
>> input_typeids[0] = pktype;
>> ! /* When is FKA we must use for FK the same type of PK */
>> ! if (is_foreign_key_array)
>> ! input_typeids[1] = pktype;
>> ! else
>> ! input_typeids[1] = fktype;
>> target_typeids[0] = opcintype;
>> target_typeids[1] = opcintype;
>> if (can_coerce_type(2, input_typeids, target_typeids,
>
> This is bogus; the can_coerce_type test will always pass (excluding bad cases
> of catalog inconsistency).
>
> ATAddForeignKeyConstraint should choose to make an array foreign key whenever
> the PK side is a scalar and the FK side is an array. Then, grab the element
> type of the FK side and feed that through the operator-identification logic.
>
>> *** a/src/backend/utils/adt/ri_triggers.c
>> --- b/src/backend/utils/adt/ri_triggers.c
>> ***************
>> *** 460,465 **** RI_FKey_check(PG_FUNCTION_ARGS)
>> --- 460,466 ----
>> char paramname[16];
>> const char *querysep;
>> Oid queryoids[RI_MAX_NUMKEYS];
>> + bool is_foreign_key_array = false;
>>
>> /* ----------
>> * The query string built is
>> ***************
>> *** 476,493 **** RI_FKey_check(PG_FUNCTION_ARGS)
>> {
>> Oid pk_type = RIAttType(pk_rel, riinfo.pk_attnums[i]);
>> Oid fk_type = RIAttType(fk_rel, riinfo.fk_attnums[i]);
>>
>> quoteOneName(attname,
>> RIAttName(pk_rel, riinfo.pk_attnums[i]));
>> sprintf(paramname, "$%d", i + 1);
>> ! ri_GenerateQual(&querybuf, querysep,
>> ! attname, pk_type,
>> ! riinfo.pf_eq_oprs[i],
>> ! paramname, fk_type);
>> querysep = "AND";
>> queryoids[i] = fk_type;
>> }
>> ! appendStringInfo(&querybuf, " FOR SHARE OF x");
>>
>> /* Prepare and save the plan */
>> qplan = ri_PlanCheck(querybuf.data, riinfo.nkeys, queryoids,
>> --- 477,524 ----
>> {
>> Oid pk_type = RIAttType(pk_rel, riinfo.pk_attnums[i]);
>> Oid fk_type = RIAttType(fk_rel, riinfo.fk_attnums[i]);
>> + is_foreign_key_array = (fk_type == get_array_type (pk_type));
>
> Drop the extra whitespace before the function argument list.
>
>>
>> quoteOneName(attname,
>> RIAttName(pk_rel, riinfo.pk_attnums[i]));
>> sprintf(paramname, "$%d", i + 1);
>> ! /*
>> ! * In case of an array foreign key, we check that every
>> ! * DISTINCT NOT NULL value in the array is present in the PK table.
>> ! * XXX: This works because the query is executed with LIMIT 1,
>
> I found this comment confusing, since the SQL syntax "LIMIT 1" is never used
> here. I suppose you're referring to the fact that we call into SPI with
> tcount = 1?
>
>> ! * but may not work properly with SSI (a better approach would be
>> ! * to inspect the array and skip the check in case of empty arrays).
>
> Why might serializable transactions be specially affected?
>
>> ! */
>> ! if (is_foreign_key_array)
>> ! {
>> ! appendStringInfo(&querybuf, " %s (SELECT count(*) FROM (SELECT DISTINCT UNNEST(%s)) y WHERE y IS NOT NULL)", querysep, paramname);
>> ! appendStringInfo(&querybuf, " = (SELECT count(*) FROM (SELECT 1 FROM ONLY %s y", pkrelname);
>> ! ri_GenerateQual(&querybuf, "WHERE",
>> ! attname, pk_type,
>> ! riinfo.pf_eq_oprs[i],
>> ! paramname, fk_type);
>> ! /*
>> ! * We lock for share every row in the pkreltable that is
>> ! * referenced by the array elements
>> ! */
>> ! appendStringInfo(&querybuf, " FOR SHARE OF y) z)");
>
> The resulting query performs an irrelevant sequential scan on the PK table:
>
> SELECT 1 FROM ONLY "public"."parent" x WHERE (SELECT count(*) FROM (SELECT DISTINCT UNNEST($1)) y WHERE y IS NOT NULL) = (SELECT count(*) FROM (SELECT 1 FROM ONLY "public"."parent" y WHERE "c" OPERATOR(pg_catalog.=) ANY ($1) FOR SHARE OF y) z)
>
> As you suggested with that comment above, this scan always ends after one row.
> That places a bound on the actual performance hit. However, we still read the
> one row, which may mean loading a page for nothing. At a minimum, simplify
> this query to:
>
> SELECT 1 WHERE (SELECT count(*) FROM (SELECT DISTINCT UNNEST($1)) y WHERE y IS NOT NULL) = (SELECT count(*) FROM (SELECT 1 FROM ONLY "public"."parent" y WHERE "c" OPERATOR(pg_catalog.=) ANY ($1) FOR SHARE OF y) z)
>
> That also naturally handles empty arrays against empty PK tables, which
> currently fail for me even at READ COMMITTED:
>
> BEGIN;
> CREATE TABLE parent (c int PRIMARY KEY);
> CREATE TABLE child (c int[] REFERENCES parent);
> INSERT INTO child VALUES ('{}'); -- fails wrongly
> ROLLBACK;
>
>> ! }
>> ! else
>> ! {
>> ! ri_GenerateQual(&querybuf, querysep,
>> ! attname, pk_type,
>> ! riinfo.pf_eq_oprs[i],
>> ! paramname, fk_type);
>> ! }
>> querysep = "AND";
>> queryoids[i] = fk_type;
>> }
>> ! /*
>> ! * We skip locking for share in case of foreign key arrays
>> ! * as it has been done in the inner subquery
>> ! */
>> ! if (! is_foreign_key_array)
>
> Drop the whitespace after the "!".
>
>> ! appendStringInfo(&querybuf, " FOR SHARE OF x");
>>
>> /* Prepare and save the plan */
>> qplan = ri_PlanCheck(querybuf.data, riinfo.nkeys, queryoids,
>
>> *** a/src/test/regress/expected/foreign_key.out
>> --- b/src/test/regress/expected/foreign_key.out
>> ***************
>> *** 968,978 **** drop table pktable;
>> drop table pktable_base;
>> -- 2 columns (1 table), mismatched types
>> create table pktable_base(base1 int not null, base2 int);
>> - create table pktable(ptest1 inet, ptest2 inet[], primary key(base1, ptest1), foreign key(base2, ptest2) references
>> - pktable(base1, ptest1)) inherits (pktable_base);
>> - NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pktable_pkey" for table "pktable"
>> - ERROR: foreign key constraint "pktable_base2_fkey" cannot be implemented
>> - DETAIL: Key columns "ptest2" and "ptest1" are of incompatible types: inet[] and inet.
>
> Instead of deleting this test, change the type from inet[] to something
> unrelated, like float8.
>
> Thanks,
> nm
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2011-11-17 05:03:56 | Re: ISN was: Core Extensions relocation |
Previous Message | Pavan Deolasee | 2011-11-17 04:49:58 | Re: FlexLocks |