Re: BUG #16586: deduplicate_items=true can be configured for numeric indexes

From: Matthias van de Meent <matthias(dot)vandemeent(at)cofano(dot)nl>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16586: deduplicate_items=true can be configured for numeric indexes
Date: 2020-09-02 10:43:56
Message-ID: CAAs3B9q6Bcc+2u7=rynO_BL65RSPCOAJWeYD4WU3AgGdsG3wuA@mail.gmail.com
Lists: pgsql-bugs

On Sat, 22 Aug 2020 at 00:49, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Aug 20, 2020 at 4:52 AM PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
> > There is no error for specifying the "deduplicate_items" flag. As
> > deduplication is not supported for indexes with numeric type, I expected the
> > index creation statement to error.
>
> I don't think that there should be an error. While the
> "equalimage"-ness of an operator class (such as btree/numeric_ops) is
> in theory static, in practice it could change in either direction. For
> example, it's possible (though very unlikely) that somebody will make
> the mistake of marking an operator class as equalimage/dedup safe when
> they shouldn't have. If this actually happens, a REINDEX shouldn't
> raise errors with the same spelling of REINDEX that worked the first
> time (e.g. when restoring a dump).
>
> The deduplicate_items storage parameter is kind of an advisory thing.

The current documentation does not make that clear: the parameter
itself is documented only as "Controls usage of the B-tree deduplication
technique described in Section 63.4.2." A note such as "Even when enabled,
deduplication is not used if the index does not meet the restrictions
described in Section 63.4.2" would help prevent confusion.
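
As I understand the explanation above, the effective rule could be
sketched roughly as follows. This is a self-contained model in C, not
nbtree code: the struct, its fields, and the function name are all
made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of one index; not a PostgreSQL data structure. */
typedef struct IndexModel
{
    bool opclass_equalimage;  /* do all operator classes report "equalimage"? */
    bool deduplicate_items;   /* the storage parameter (defaults to on) */
} IndexModel;

/*
 * The storage parameter is advisory: deduplication is used only when the
 * parameter allows it AND the operator classes make it safe.
 */
bool
dedup_effective(const IndexModel *idx)
{
    return idx->deduplicate_items && idx->opclass_equalimage;
}
```

Under this model an index on a numeric column has opclass_equalimage
false, so deduplicate_items = on is accepted but has no effect.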

> Deduplication is always applied selectively in unique indexes, even
> though it might be slightly better to do so consistently with some
> workloads. Also, it's possible that we'll find a way to make some of
> the operator classes (though not btree/numeric_ops) deduplication safe
> in the future. For example, we could teach container types to report
> their "equalimage"-ness by invoking the underlying support function of
> contained types. So you could use deduplication with a composite type,
> provided it didn't contain unsafe scalar types like numeric.
>
> In general I don't expect that users will consciously think about
> deduplication very often -- it's supposed to have very little overhead
> in cases that don't benefit, so it will probably fade into the
> background even in installations where it provides a lot of benefit. I
> don't expect many users will want to make sure that it's enabled in
> one index but definitely not enabled in another.
>
> With all of that said, it would be nice if I could raise a NOTICE or
> even a WARNING here if and only if the user spelled out
> "deduplicate_items = on". Hard to see how to do that with the current
> design of reloptions, though, unless it's okay to show it even when
> "deduplicate_items = on" was not specifically provided (I don't think
> that it's okay). An index access method (such as nbtree) can tell
> whether or not all storage params should come from the defaults by
> checking if the rel's rd_options is NULL or not, but that's not the
> same thing -- it'll be set when fillfactor was explicitly set, for
> example.
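
To check my understanding of the rd_options problem, here is a small
model in C. It is only a sketch under assumptions: ParsedOptions and
build_options are hypothetical names, not the reloptions API, and the
defaults (fillfactor 90, deduplication on) are the B-tree defaults as I
know them.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEFAULT_FILLFACTOR 90
#define DEFAULT_DEDUP      true

/* Hypothetical stand-in for the parsed storage parameters of an index. */
typedef struct ParsedOptions
{
    int  fillfactor;
    bool deduplicate_items;
} ParsedOptions;

/*
 * A NULL argument means "the user did not specify this parameter".
 * Returns false when nothing was specified (models rd_options == NULL).
 * Otherwise fills *out, using defaults for anything not specified.
 */
bool
build_options(const int *user_fillfactor, const bool *user_dedup,
              ParsedOptions *out)
{
    if (user_fillfactor == NULL && user_dedup == NULL)
        return false;

    out->fillfactor = user_fillfactor ? *user_fillfactor : DEFAULT_FILLFACTOR;
    out->deduplicate_items = user_dedup ? *user_dedup : DEFAULT_DEDUP;
    return true;
}
```

With this model, WITH (fillfactor = 70) and WITH (fillfactor = 70,
deduplicate_items = on) produce identical structs, so code that only
sees the parsed result cannot tell whether the user spelled out
deduplicate_items.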

Thanks for the reply, it was very insightful.

- Matthias

> --
> Peter Geoghegan
