Re: slow queries over information schema.tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: slow queries over information schema.tables
Date: 2018-12-20 03:31:59
Message-ID: CAFj8pRDrfRR8MNxPTuPnN27FUcNeKwiPgiFttzt5K_XVWKOH7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

čt 20. 12. 2018 v 0:14 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> I wrote:
> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> >> Slow query
> >> select * from information_schema.tables where table_name = 'pg_class';
>
> > Yeah. This has been complained of many times before.
>
> > The core of the problem, I think, is that we're unable to convert the
> > condition on table_name into an indexscan on pg_class.relname, because
> > the view has cast pg_class.relname to the sql_identifier domain.
>
> > There are two different issues in that. One is that the domain might
> > have constraints (though in reality it does not), so the planner can't
> > throw away the CoerceToDomain node, and thus can't match the expression
> > to the index. Even if we did throw away the CoerceToDomain, it still
> > would not work because the domain is declared to be over varchar, and
> > so there's a cast-to-varchar underneath the CoerceToDomain.
>
> After my last few commits, the only issue that's left here is the
> cast-to-varchar implied by casting to sql_identifier. Upthread
> I showed a possible planner hack to get rid of that, and we could
> still solve it that way so far as allowing indexscans on catalogs
> is concerned. However, I wonder what people would think of a
> more aggressive approach, viz:
>
> diff --git a/src/backend/catalog/information_schema.sql
> b/src/backend/catalog/information_schema.sql
> index 0fbcfa8..3891e3b 100644
> --- a/src/backend/catalog/information_schema.sql
> +++ b/src/backend/catalog/information_schema.sql
> @@ -216,7 +216,7 @@ CREATE DOMAIN character_data AS character varying
> COLLATE "C";
> * SQL_IDENTIFIER domain
> */
>
> -CREATE DOMAIN sql_identifier AS character varying COLLATE "C";
> +CREATE DOMAIN sql_identifier AS name;
>
>
>
> I've not checked to verify that sql_identifier is used for all and only
> those view columns that expose "name" catalog columns. If the SQL
> committee was sloppy about that, this idea might not work. But assuming
> that the length restriction is valid for the columns that have this
> type, would this be an OK idea? It does seem to fix the poor-plan-quality
> problem at a stroke, with no weird planner hacks.
>
> What I find in the SQL spec is
>
> 5.5 SQL_IDENTIFIER domain
>
> Function
>
> Define a domain that contains all valid <identifier body>s and
> <delimited identifier body>s.
>
> Definition
>
> CREATE DOMAIN SQL_IDENTIFIER AS
> CHARACTER VARYING (L)
> CHARACTER SET SQL_IDENTIFIER;
>
> GRANT USAGE ON DOMAIN SQL_IDENTIFIER
> TO PUBLIC WITH GRANT OPTION;
>
> Description
>
> 1) This domain specifies all variable-length character values that
> conform to the rules for formation and representation of an SQL
> <identifier body> or an SQL <delimited identifier body>.
>
> NOTE 4 - There is no way in SQL to specify a <domain
> constraint> that would be true for the body of any valid SQL
> <regular identifier> or <delimited identifier> and false for
> all
> other character string values.
>
> 2) L is the implementation-defined maximum length of <identifier
> body> and <delimited identifier body>.
>
> So we'd be violating the part of the spec that says that the domain's
> base type is varchar, but considering all the other requirements here
> that we're blithely ignoring, maybe that's not such a sin. With the
> recent collation changes, type name is hard to functionally distinguish
> from a domain over varchar anyway. Furthermore, since name's length limit
> corresponds to the "implementation-defined maximum length" part of the
> spec, you could argue that in some ways this definition is closer to the
> spec than what we've got now.
>
> Thoughts?
>

The very common will be compare with text type - some like

SELECT * FROM information_schema.tables WHERE table_name =
lower('somename');

> regards, tom lane
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-12-20 03:44:33 Switching to 64-bit Bitmapsets
Previous Message John Naylor 2018-12-20 02:59:43 Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)