Re: Potential optimisation for the creation of a partial index condition over a newly created nullable, non-default-valued column?

From: Tim Dawborn <tim(dot)dawborn(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: "Psql_General (E-mail)" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Potential optimisation for the creation of a partial index condition over a newly created nullable, non-default-valued column?
Date: 2020-06-06 08:53:00
Message-ID: CAN9Kr4BV9owUUcKK8QgdDdP96NQVuOeZwyicZfPxMdJsLDP78Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 5 Jun 2020 at 20:15, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> While it could be technically possible to do something like check the
> xmin of the pg_attribute record for all columns mentioned in the
> index's predicate all are set to the current transaction ID and the
> index predicate refutes an expression containing those columns with IS
> NULL clauses or whatever the DEFAULT expression value is, we've just
> no way to know if any rows were inserted or updated between the ALTER
> TABLE and the CREATE INDEX. Certainly, no other transaction could have
> done anything since we added the column due to us holding the
> AccessExclusiveLock. We just don't really have a way to know if our
> own transaction did anything in between. So to do this, we'd need to
> invent something to track that, and that something wouldn't be free to
> maintain and we'd likely need to maintain it all the time since we'd
> be unable to predict what future commands might need to use that
> information.
>

I see. Thanks for explaining. I'm not familiar with the internals of
Postgres. I thought there might have already been something akin to a
"dirty" bit for each table for each transaction to indicate whether or not
it had been modified in some way, which could have been used for this
hypothetical process. It sounds like that's not the case, in which case
yes, the additional overhead of this bookkeeping required for this feature
would be unjustifiable.

> It would likely be easier to go about it by having some sort of ALTER
> TABLE ADD INDEX command, then just chain the alter tables together. We
> have various other optimisations when multiple subcommands are used in
> a single ALTER TABLE. However, I'm not suggesting we go and allow
> indexes to be created in ALTER TABLE. I"m just suggesting that it
> would likely be a better alternative than inventing something to track
> when a table last had
>

Nice. I hadn't considered that as a way to go about solving this problem.
Having a quick search online for ALTER TABLE CREATE INDEX syntax, it
appears that MySQL does support this notion (via ALTER TABLE ADD INDEX
<https://dev.mysql.com/doc/refman/8.0/en/alter-table.html>), though it
doesn't (currently) support partial indexes. Having a quick look in the
mailing list archives, I couldn't find any discussions about supporting
this syntax. What are the arguments against adding such additional syntax
(other than the standard issues that come with adding additional syntax)?

Cheers,
Tim

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Laura Smith 2020-06-06 08:58:42 Re: Postgres12 - Confusion with pg_restore
Previous Message Rob Sargent 2020-06-06 04:24:17 Re: Multitenent architecture