Quick Links

Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column

From:	Peter Geoghegan <pg(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Daniel Newman <dtnewman(at)gmail(dot)com>, danielnewman(at)umich(dot)edu, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject:	Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
Date:	2016-07-16 02:21:06
Message-ID:	CAM3SWZTBAo4hjbBd780+MrOKiKp_TMo1N3A0Rw9_im8gbD7fQA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

On Mon, Jun 27, 2016 at 1:22 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> I like this idea. Should I write a patch?
>
> BTW, while you're at it: it strikes me that the threshold should be
> either min(NBuffers, maintenance_work_mem) or
> min(NLocBuffer, maintenance_work_mem), depending on whether we're
> talking about a regular or temp table/index. That is, there's a
> pre-existing bug here that when NLocBuffer is a good deal less than
> NBuffers (which is the typical case nowadays) you'll get a lot of
> thrashing between local buffers and kernel cache, if the index isn't
> quite big enough to trigger the sorting code. This might not manifest
> as actual I/O, but it's not the intended behavior either.

Finally found time for this. Attached patch tests hash tuplesorts
along the lines we discussed. It adds one new tuplesort operation,
which does not spill to disk. It also asserts that hash values
retrieved through the tuplesort interface are in fact in sorted order.
I wanted to have something reliably trip when comparetup_index_hash()
gives wrong answers, even when a corrupt final index is perhaps not a
consequence of the underlying bug.

Due to how the hash build code works, maintenance_work_mem will need
to be 2 blocks smaller than the final index size in order to force a
sort (see code comments). I think that this does not matter, since
none of this is documented currently.

Having reviewed the coverage report for tuplesort.c [1], I think that
the only notable testing omission once this latest patch is committed
will be that there is no coverage for "backwards get tuple" cases
(this is more noticeable when looking at coverage statistics for
logtape.c, actually). This seems less important to me, and would be
awkward to test in any case, given my constraints.

[1] http://coverage.postgresql.org/src/backend/utils/sort/tuplesort.c.gcov.html
--
Peter Geoghegan

Attachment	Content-Type	Size
0001-Add-regression-test-cases-for-hash-tuplesorts.patch	text/x-patch	8.9 KB

In response to

Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column at 2016-06-27 20:22:25 from Tom Lane

Responses

Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column at 2016-07-16 19:35:36 from Tom Lane

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Amit Kapila	2016-07-16 12:55:08	Re: Invalid indexes should not consume update overhead
Previous Message	Charles	2016-07-15 18:48:21	Using row-level security results in less optimal query plans