Re: BUG #16586: deduplicate_items=true can be configured for numeric indexes

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Cc: Matthias van de Meent <matthias(dot)vandemeent(at)cofano(dot)nl>
Subject: Re: BUG #16586: deduplicate_items=true can be configured for numeric indexes
Date: 2020-08-21 22:49:03
Message-ID: CAH2-Wzmc0FL+VLxwAH+1benCdPFX6ZnR_hYz1HurMuDJwa3UUA@mail.gmail.com
Lists: pgsql-bugs

On Thu, Aug 20, 2020 at 4:52 AM PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> There is no error for specifying the "deduplicate_items" -flag. As
> deduplication is not supported for indexes with numeric type, I expected the
> index creation statement to error.

I don't think that there should be an error. While the
"equalimage"-ness of an operator class (such as btree/numeric_ops) is
in theory static, in practice it could change in either direction. For
example, it's possible (though very unlikely) that somebody will make
the mistake of marking an operator class as equalimage/dedup safe when
they shouldn't have. If this actually happens, a REINDEX (or a CREATE
INDEX run when restoring a dump) shouldn't raise an error for the same
spelling of the statement that worked the first time.
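
To make the "no error" behaviour concrete, here is a minimal Python sketch of
the idea: the storage parameter is treated as a request, and deduplication is
only in effect when every key column's operator class is equalimage. This is a
toy model, not PostgreSQL code. The names OPCLASS_EQUALIMAGE and
dedup_in_effect are hypothetical, and the table entries are assumptions made
for illustration.

```python
# Toy model only: none of these names exist in PostgreSQL.
OPCLASS_EQUALIMAGE = {
    "int4_ops": True,      # assumed deduplication safe for this illustration
    "numeric_ops": False,  # not deduplication safe, as in the bug report
}


def dedup_in_effect(deduplicate_items, opclasses):
    """Return whether deduplication would actually be used.

    An unsafe operator class never raises an error here. The request is
    simply not honoured, so the same index definition keeps working even if
    an operator class's equalimage-ness changes later.
    """
    all_equalimage = all(OPCLASS_EQUALIMAGE[name] for name in opclasses)
    return deduplicate_items and all_equalimage
```

In this model, asking for deduplication on a numeric_ops index returns False
rather than failing, which is the behaviour the bug report observed.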

The deduplicate_items storage parameter is kind of an advisory thing.
Deduplication is always applied selectively in unique indexes, even
though it might be slightly better to do so consistently with some
workloads. Also, it's possible that we'll find a way to make some of
the operator classes (though not btree/numeric_ops) deduplication safe
in the future. For example, we could teach container types to report
their "equalimage"-ness by invoking the underlying support function of
contained types. So you could use deduplication with a composite type,
provided it didn't contain unsafe scalar types like numeric.
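
A rough Python sketch of the container-type idea follows. It only illustrates
the proposal: no such mechanism is described as existing, and the names
SCALAR_EQUALIMAGE and composite_equalimage, as well as the table entries, are
hypothetical. A composite is reported as safe only when every member, at any
nesting depth, is safe.

```python
# Toy model of a possible future design, not an existing PostgreSQL feature.
SCALAR_EQUALIMAGE = {
    "int4": True,      # assumed safe for this illustration
    "numeric": False,  # unsafe scalar type, as mentioned above
}


def composite_equalimage(member_types):
    """Report a composite as safe only if all of its members are safe.

    Members are scalar type names, or nested tuples standing in for
    composites that contain other composites.
    """
    for member in member_types:
        if isinstance(member, tuple):
            # Ask the contained composite to report on its own members.
            if not composite_equalimage(member):
                return False
        elif not SCALAR_EQUALIMAGE[member]:
            return False
    return True
```

With this model, ("int4", "int4") is reported as safe, while
("int4", ("int4", "numeric")) is not, because of the nested numeric member.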

In general I don't expect that users will consciously think about
deduplication very often -- it's supposed to have very little overhead
in cases that don't benefit, so it will probably fade into the
background even in installations where it provides a lot of benefit. I
don't expect many users will want to make sure that it's enabled in
one index but definitely not enabled in another.

With all of that said, it would be nice if I could raise a NOTICE or
even a WARNING here if and only if the user spelled out
"deduplicate_items = on". Hard to see how to do that with the current
design of reloptions, though, unless it's okay to show it even when
"deduplicate_items = on" was not specifically provided (I don't think
that it's okay). An index access method (such as nbtree) can tell
whether or not all storage params should come from the defaults by
checking if the rel's rd_options is NULL or not, but that's not the
same thing -- it'll be set when fillfactor was explicitly set, for
example.
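
The rd_options problem can be sketched in Python as follows. This is a
simplified model rather than the real reloptions code: BTOptionsModel,
parse_reloptions and all_defaults are hypothetical names, and the only facts
assumed about nbtree are its defaults (fillfactor 90, deduplicate_items on).
Once any parameter is given, the parsed options hold a value for every field,
so an explicit "deduplicate_items = on" looks the same as the default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BTOptionsModel:
    # Hypothetical stand-in for parsed storage parameters. Fields that the
    # user did not mention are filled in with their defaults.
    fillfactor: int = 90
    deduplicate_items: bool = True


def parse_reloptions(with_clause: dict) -> Optional[BTOptionsModel]:
    """Return None when no storage parameters were given at all."""
    if not with_clause:
        return None
    return BTOptionsModel(**with_clause)


def all_defaults(rd_options: Optional[BTOptionsModel]) -> bool:
    """Mirror the 'is rd_options NULL?' check described above."""
    return rd_options is None
```

Here parse_reloptions({"fillfactor": 70}) and
parse_reloptions({"fillfactor": 70, "deduplicate_items": True}) compare equal,
and all_defaults() is False for both, so neither check can tell whether the
user actually spelled out deduplicate_items.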

--
Peter Geoghegan