Re: hashjoins, index loops to retrieve pk/ux constrains in pg12

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Arturas Mazeika <mazeika(at)gmail(dot)com>
Cc: Michael Lewis <mlewis(at)entrata(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: hashjoins, index loops to retrieve pk/ux constrains in pg12
Date: 2021-09-28 14:13:57
Message-ID: 2835979.1632838437@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Arturas Mazeika <mazeika(at)gmail(dot)com> writes:
> Thanks a lot for having a look at the query once again in more detail. In
> short, you are right, I fired the liquibase scripts and observed the exact
> query that was hanging in pg_stats_activity. The query was:

> SELECT
> FK.TABLE_NAME as "TABLE_NAME"
> , CU.COLUMN_NAME as "COLUMN_NAME"
> , PK.TABLE_NAME as "REFERENCED_TABLE_NAME"
> , PT.COLUMN_NAME as "REFERENCED_COLUMN_NAME"
> , C.CONSTRAINT_NAME as "CONSTRAINT_NAME"
> FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS C
> INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS FK ON
> C.CONSTRAINT_NAME = FK.CONSTRAINT_NAME
> INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS PK ON
> C.UNIQUE_CONSTRAINT_NAME = PK.CONSTRAINT_NAME
> INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE CU ON C.CONSTRAINT_NAME
> = CU.CONSTRAINT_NAME
> INNER JOIN (
> SELECT
> i1.TABLE_NAME
> , i2.COLUMN_NAME
> FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS i1
> INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE i2 ON
> i1.CONSTRAINT_NAME = i2.CONSTRAINT_NAME
> WHERE i1.CONSTRAINT_TYPE = 'PRIMARY KEY'
> ) PT ON PT.TABLE_NAME = PK.TABLE_NAME WHERE
> lower(FK.TABLE_NAME)='secrole_condcollection'

TBH, before worrying about performance you should be worrying about
correctness. constraint_name alone is not a sufficient join key
for these tables, so who's to say whether you're even getting the
right answers?

Per SQL spec, the join key to use is probably constraint_catalog
plus constraint_schema plus constraint_name. You might say you
don't need to compare constraint_catalog because that's fixed
within any one Postgres database, and that observation would be
correct. But you can't ignore the schema.

What's worse, the SQL-spec join keys are based on the assumption that
constraint names are unique within schemas, which is not enforced in
Postgres. Maybe you're all right here, because you're only looking
at primary key constraints, which are associated with indexes, which
being relations do indeed have unique-within-schema names. But you
still can't ignore the schema.

On the whole I don't think you're buying anything by going through
the SQL-spec information views, because this query is clearly pretty
dependent on Postgres-specific assumptions even if it looks like it's
portable. And you're definitely giving up a lot of performance, since
those views have so many complications from trying to map the spec's
view of whats-a-constraint onto the Postgres objects (not to mention
the spec's arbitrary opinions about which objects you're allowed to
see). This query would be probably be simpler, more correct, and a
lot faster if rewritten to query the Postgres catalogs directly.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Daniel Diniz 2021-09-28 15:39:52 Problem with indices from 10 to 13
Previous Message ldh@laurent-hasson.com 2021-09-28 04:23:05 RE: Big Performance drop of Exceptions in UDFs between V11.2 and 13.4