Re: Idea: Avoid JOINs by using path expressions to follow FKs

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Vik Fearing" <vik(at)postgresfriends(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Idea: Avoid JOINs by using path expressions to follow FKs
Date: 2021-03-30 07:47:07
Message-ID: 4affc3c5-8904-40b3-9a6d-6440ee7cc81a@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 30, 2021, at 08:03, Pavel Stehule wrote:
> Maybe there were no technical problems. Just this technology was coming at a bad time. The people who needed (wanted) OOP access to data got the Hibernate, and there was no necessity to do this work on SQL level. In this time, there was possibility to use GUI for databases, and in this time there were a lot of graphic query designers.

Thanks for giving this perspective. It seems like a likely explanation. In the ORM camp, SQL is merely a low-level language compilation target, not a language humans primarily write code in.

> I don't like the idea of foreign key constraint names - it doesn't look comfortable to me. I don't say it is a bad idea, but it is not SQL, and I am not sure if it needs more or less work than explicitly to write PK=FK.

I agree, it's not very comfortable. Maybe we can think of ways to improve the comfort?

Here are two such ideas:

Idea #1
=======

Initial semi-automated script-assisted renaming of existing foreign keys.

In my experiences, multiple foreign keys per primary table is quite common,
but not multiple foreign keys referencing the same foreign table from the same primary table.

If so, then a script can be written to rename most existing foreign keys:

--
-- Script to rename foreign keys to the name of the foreign table.
-- Tables with multiple foreign keys referencing the same foreign table are skipped.
--
DO
$_$
DECLARE
sql_cmd text;
BEGIN
FOR sql_cmd IN
SELECT * FROM
(
SELECT
format
(
'ALTER TABLE %I.%I RENAME CONSTRAINT %I TO %I;',
conrel_nsp.nspname,
conrel.relname,
pg_constraint.conname,
confrel.relname
) AS sql_cmd,
COUNT(*) OVER (PARTITION BY pg_constraint.conrelid, pg_constraint.confrelid)
AS count_foreign_keys_to_same_table
FROM pg_constraint
JOIN pg_class AS conrel
ON conrel.oid = pg_constraint.conrelid
JOIN pg_class AS confrel
ON confrel.oid = pg_constraint.confrelid
JOIN pg_namespace AS conrel_nsp
ON conrel_nsp.oid = conrel.relnamespace
WHERE pg_constraint.contype = 'f'
) AS x
WHERE count_foreign_keys_to_same_table = 1
LOOP
RAISE NOTICE '%', sql_cmd;
EXECUTE sql_cmd;
END LOOP;
END
$_$;

For our example data model, this would produce:

ALTER TABLE public.orders RENAME CONSTRAINT orders_customer_id_fkey TO customers;
ALTER TABLE public.order_details RENAME CONSTRAINT order_details_order_id_fkey TO orders;
ALTER TABLE public.order_details RENAME CONSTRAINT order_details_product_id_fkey TO products;

To clarify what I mean with multiple foreign keys to the same table, here is an example:

CREATE TABLE p (
a int,
b int,
PRIMARY KEY (a),
UNIQUE (a,b)
);

CREATE TABLE f1 (
a int,
b int,
FOREIGN KEY (a) REFERENCES p
);

CREATE TABLE f2 (
a int,
b int,
FOREIGN KEY (a) REFERENCES p,
FOREIGN KEY (a,b) REFERENCES p(a,b)
);

For this example, only f1's foreign key constraint would be renamed:

ALTER TABLE public.f1 RENAME CONSTRAINT f1_a_fkey TO p;

Idea #2
=======

Allow user to define the default format for new foreign key constraint name.

The format could use template patterns similar to how e.g. to_char() works.
If a conflict is found, it would do the same as today, try appending an increasing integer.

Users could then decide on a company-wide consistent naming convention
on how foreign keys are usually named, which would reduce the need to manually name them
using the CONSTRAINT keyword.

Finally, just for fun, here is an example of how we could write the query above,
if we would have real foreign keys on the catalogs:

SELECT
format
(
'ALTER TABLE %I.%I RENAME CONSTRAINT %I TO %I;',
pg_constraint.conrel.pg_namespace.nspname,
pg_constraint.conrel.relname,
pg_constraint.conname,
pg_constraint.confrel.relname,
) AS sql_cmd,
COUNT(*) OVER (PARTITION BY pg_constraint.conrelid, pg_constraint.confrelid)
AS count_foreign_keys_to_same_table
FROM pg_constraint
WHERE pg_constraint.contype = 'f'

In this example the foreign key constraint names have been
derived from the column names since both conrelid and confrelid,
reference pg_class.

I think this is a good example of where this improves the situation the most,
when you have multiple joins of the same table, forcing you to come up with multiple aliases
for the same table, keeping them all in memory while writing and reading such queries.

> On second hand, it can be very nice to have some special strict mode in Postgres - maybe slower, not compatible, that disallow some dangerous or unsafe queries. But it is possible to solve in extensions, but nobody did it. Something like plpgsql_check for SQL - who will write sql_check?

Not a bad idea, this is a real problem, such a tool would be useful even with this proposed new syntax, as normal JOINs would continue to co-exist, for which nonsensical joins would still be possible.

/Joel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Wanner 2021-03-30 08:10:54 Re: [PATCH] add concurrent_abort callback for output plugin
Previous Message Ajin Cherian 2021-03-30 07:39:31 Re: [PATCH] add concurrent_abort callback for output plugin