Re: [PATCH] nbtree: Do not show debugmessage if deduplication is disabled

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] nbtree: Do not show debugmessage if deduplication is disabled
Date: 2020-12-17 19:12:20
Message-ID: CAH2-Wz=FM=oAK9jOwb1Rz9CuPLQRD=fapAK7Cgq4XwjdSUPbQw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 16, 2020 at 5:28 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> Even though the message literally says whether the index "can safely" or
> "cannot" use deduplication, the function specifically avoids the debug message
> for system columns, so I think it also makes sense to hide it when
> deduplication is turned off.

I disagree. The point of the message is to advertise whether
deduplication is possible in principle for indexes where support is
not precluded by a significant design issue that will almost certainly
not change in the future. The debug message should only apply to
indexes that have no INCLUDE (non-key) columns and that are not system
catalog indexes.
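
To make the rule above concrete, here is a minimal toy model in Python. It is a sketch of the behavior argued for in this message, not the nbtree C code: the `IndexInfo` class, its field names, and the function name are all hypothetical, and the message wording only follows the "can safely" / "cannot" phrasing quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexInfo:
    """Hypothetical stand-in for the facts the check needs (not a PostgreSQL struct)."""
    name: str
    has_include_columns: bool        # index has INCLUDE non-key columns
    is_system_catalog_index: bool    # index belongs to a system catalog
    all_opclasses_equalimage: bool   # every key opclass says "equality implies image equality"
    deduplicate_items: bool = True   # storage parameter; deliberately NOT consulted below


def dedup_debug_message(idx: IndexInfo) -> Optional[str]:
    """Return the debug message text, or None when no message should be shown."""
    # Support is precluded by design for these indexes, so stay silent.
    if idx.has_include_columns or idx.is_system_catalog_index:
        return None
    # Otherwise report what is possible in principle, whatever the
    # storage parameter currently says.
    if idx.all_opclasses_equalimage:
        return f'index "{idx.name}" can safely use deduplication'
    return f'index "{idx.name}" cannot use deduplication'
```

Note that in this sketch an index created with deduplication turned off still gets the "can safely" message, which is the point of disagreement with the patch.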

In general, I think of the storage parameter as advisory. If it wasn't
advisory then we'd have no way of rescinding support for deduplication
in the event of an opclass that somehow gets the "equality implies
image equality" question wrong. If it wasn't advisory then we might
end up raising an error when the user explicitly asks for
deduplication but that isn't possible -- which might break somebody's
pg_restore workflow.
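
The difference between an advisory and a binding setting can be sketched as follows. Both functions are hypothetical illustrations in Python, not anything in the PostgreSQL source; the "strict" variant only exists to show the failure mode that an advisory parameter avoids.

```python
def deduplication_in_effect(requested: bool, safe_in_principle: bool) -> bool:
    """Advisory model: a request that cannot be honored is quietly ignored."""
    return requested and safe_in_principle


def deduplication_in_effect_strict(requested: bool, safe_in_principle: bool) -> bool:
    """Binding model (hypothetical alternative): an impossible request is an error."""
    if requested and not safe_in_principle:
        # This is the kind of error that could break a restore of a dump
        # that explicitly asks for deduplication.
        raise ValueError("deduplication requested but not possible")
    return requested
```

Under the advisory model, support can also be rescinded later for a misbehaving opclass simply by making `safe_in_principle` false, without invalidating any existing index definition.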

Even when deduplication is both the safe and the desired behavior,
there is at least one case where it's applied selectively. We do this
in unique indexes, where deduplication can only help with version
churn duplicates and so we only try to deduplicate when that appears
to be a factor. By the same token, when the user disables
deduplication via the storage parameter (presumably due to the
performance trade-off somehow not seeming useful), they cannot expect
to get back to an on-disk representation without posting list tuples,
unless and until they REINDEX.

--
Peter Geoghegan
