Re: BUG #17158: Distinct ROW fails with Postgres 14

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, sait(dot)nisanci(at)microsoft(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17158: Distinct ROW fails with Postgres 14
Date: 2021-08-27 10:36:04
Message-ID: c182f5c8-fdf3-80a0-fa43-4ed7e87d4d47@enterprisedb.com
Lists: pgsql-bugs


On 25.08.21 00:16, Tom Lane wrote:
> Undoing that would lose v14's ability to select hashed duplicate
> elimination for RECORD columns, but that's still not a regression
> because we didn't have it before. Moreover, anyone who's unhappy can
> work around the problem by explicitly casting the column to some
> suitable named composite type. We can leave it for later to make the
> planner smarter about anonymous record types. It clearly could be
> smarter, at least for the case of an explicit ROW construct at top
> level; but now is no time to be writing such code for v14.

This feature is a requirement for multicolumn path and cycle tracking in
recursive queries, as well as the search/cycle syntax built on top of
that, so there is a bit more depending on it than might be at first
apparent.

I've been looking at ways to repair this with minimal impact.
Essentially, we'd need a way to ask the type cache to distinguish between
"do you have hash support if it's guaranteed to work" versus "hash
support is my only hope, so give it to me even if you're not completely
sure it will work". Putting this directly into the type cache does not
seem feasible with the current structure. But there aren't that many
callers of TYPECACHE_HASH_PROC*, so I looked at handling it there.
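
To make the two questions concrete, here is a minimal toy sketch in C. It is not PostgreSQL code and not what the attached patches do. Every name in it (ToyType, HashRequest, toy_has_hash_support) is hypothetical, and an anonymous record is modeled simply as a record whose column list is unknown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, none exist in PostgreSQL. */
typedef enum { KIND_SCALAR, KIND_RECORD } TypeKind;

typedef struct ToyType {
    TypeKind kind;
    bool scalar_hashable;                   /* used for KIND_SCALAR only */
    size_t ncolumns;                        /* used for KIND_RECORD only */
    const struct ToyType *const *columns;   /* NULL: column types unknown */
} ToyType;

typedef enum {
    HASH_IF_GUARANTEED,   /* "do you have hash support if it's guaranteed to work" */
    HASH_BEST_EFFORT      /* "hash support is my only hope" */
} HashRequest;

/* Answer the hash-support question under the given request mode. */
bool toy_has_hash_support(const ToyType *t, HashRequest req)
{
    assert(t != NULL);

    if (t->kind == KIND_SCALAR)
        return t->scalar_hashable;

    /* Record with unknown columns: only an optimistic answer is possible. */
    if (t->columns == NULL)
        return req == HASH_BEST_EFFORT;

    /* Record with known columns: every column must itself be hashable. */
    for (size_t i = 0; i < t->ncolumns; i++) {
        if (!toy_has_hash_support(t->columns[i], req))
            return false;
    }
    return true;
}
```

In this model the two request modes differ only for records whose columns are unknown, which is the case at issue here.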

Variant 1 is that we let the type cache *not* report hash support for
the record type, and let callers fill it in. In the attached patch I've
only done this for hash_array(), because that's what's needed to get the
tests to pass, but similar code would be possible for row types, range
types, etc.

Variant 2 is that we let the type cache report hash support for the
record type, like now, and then let callers override it if they have
other options. This is the second attached patch.
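
The difference between the two variants can be sketched as a toy model in C. This is an illustration under stated assumptions, not the patch code: ToyCallSite and all four functions are hypothetical names, and "has_alternative" stands in for whatever other option (for example a sort-based strategy) a caller might have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not PostgreSQL structures. */
typedef struct ToyCallSite {
    bool is_anonymous_record;  /* column types are not known to the cache */
    bool known_hashable;       /* cache's answer for any other kind of type */
    bool has_alternative;      /* caller has another option besides hashing */
} ToyCallSite;

/* Variant 1: the cache does NOT report hash support for the record type. */
bool variant1_cache_reports_hash(const ToyCallSite *s)
{
    assert(s != NULL);
    return !s->is_anonymous_record && s->known_hashable;
}

/* Variant 1 caller: fills hash support in when hashing is its only hope. */
bool variant1_use_hash(const ToyCallSite *s)
{
    if (variant1_cache_reports_hash(s))
        return true;
    return s->is_anonymous_record && !s->has_alternative;
}

/* Variant 2: the cache DOES report hash support for the record type. */
bool variant2_cache_reports_hash(const ToyCallSite *s)
{
    assert(s != NULL);
    return s->is_anonymous_record || s->known_hashable;
}

/* Variant 2 caller: overrides the cache when it has another option. */
bool variant2_use_hash(const ToyCallSite *s)
{
    if (!variant2_cache_reports_hash(s))
        return false;
    return !(s->is_anonymous_record && s->has_alternative);
}
```

In this toy model both variants reach the same decision for every input; they differ only in whether the special case lives in the callers that opt in (variant 1) or in the callers that opt out (variant 2).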

The two variants are basically fifty-fifty in terms of how many places
you need to touch.

With either patch, you'll see the "union" regression test fail; it
includes a test case that is equivalent to the one from this bug report
(but using money instead of bit). The "with" test still passes, and
that test covers the feature I mentioned at the beginning.

Thoughts?

Attachment                                    Content-Type  Size
0001-Fix-record-hash-support-variant-1.patch  text/plain    8.1 KB
0001-Fix-record-hash-support-variant-2.patch  text/plain    7.5 KB
