Quick Links

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

From:	Stefan Keller <sfkeller(at)gmail(dot)com>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	Peter Geoghegan <peter(at)2ndquadrant(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Date:	2011-09-18 14:59:10
Message-ID:	CAFcOn2_NnrzquFQ+YPw2vh=ioC+KXcE1NdEJZWG4O7pmY1Eu3Q@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Merlin and Jeff,

General remark again:It's hard for me to imagine that btree is
superior for all the issues mentioned before. I still believe in hash
index for primary keys and certain unique constraints where you need
equality search and don't need ordering or range search.

2011/9/17 Jeff Janes <jeff(dot)janes(at)gmail(dot)com>:
(...)
> Also, that link doesn't address concurrency of selects at all, only of inserts.

How would (or did) you test and benchmark concurrency of inserts and selects?
Use pgbench with own config for a blackbox test?

2011/9/18 Merlin Moncure <mmoncure(at)gmail(dot)com>:
> Here again, I think that any proposed improvement in the current hash
> index code should be measured against wrapping a btree index. You
> get wal logging and high concurrency for free if you decide to do
> that.

As I understand, this would be an enhancement of btree. That's ok for
btree but not really exploiting all advantages of a separate hash
index, would'nt it?

Stefan

2011/9/18 Merlin Moncure <mmoncure(at)gmail(dot)com>:
> On Sat, Sep 17, 2011 at 4:48 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
>>> On 14 September 2011 00:04, Stefan Keller <sfkeller(at)gmail(dot)com> wrote:
>>>> Has this been verified on a recent release? I can't believe that hash
>>>> performs so bad over all these points. Theory tells me otherwise and
>>>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.
>>
>> My understanding is that a huge amount of work has gone into making
>> btree what it is in
>> PG, and not nearly as much work has gone into making hash indexes what
>> they could be.
>>
>>
>>> Hash indexes have been improved since 2005 - their performance was
>>> improved quite a bit in 9.0. Here's a more recent analysis:
>>>
>>> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/
>>
>> They are 3 time faster to build. But if you rip the WAL logging out
>> of btree, how much faster would those get?
>>
>> Also, that link doesn't address concurrency of selects at all, only of inserts.
>
> Of course hash indexes are faster to build than varlen string indexes
> :-). I use natural keys 50-80% of the time and hash indexing would
> remove some of the pain in cases where I don't need ordering and range
> operations. In fact, if they are made to properly support wal logging
> and uniqueness, I imagine they should supplant btree in a broad range
> of cases, so much so that it would be awful nice to be able to have
> syntax to choose hash for primary keys and unique constraints.
>
> @ Jeff:
>>I think that adding WAL to hash indexes without first
> addressing the heavy-weight locking issue would be a mistake.
> Even if the WAL was fixed, the bad performance under
> concurrent selects would still make it at best a narrow
> niche thing. And fixing the locking *after* WAL is in place would
> probably be very much harder than the other order.
>
> Here again, I think that any proposed improvement in the current hash
> index code should be measured against wrapping a btree index. You
> get wal logging and high concurrency for free if you decide to do
> that.
>
> merlin
>

In response to

Re: Hash index use presently(?) discouraged since 2005: revive or bury it? at 2011-09-17 22:14:55 from Merlin Moncure

Responses

Re: Hash index use presently(?) discouraged since 2005: revive or bury it? at 2011-09-19 02:14:39 from Jeff Janes
Re: Hash index use presently(?) discouraged since 2005: revive or bury it? at 2011-09-19 11:13:37 from Robert Klemme
Re: Hash index use presently(?) discouraged since 2005: revive or bury it? at 2011-09-19 14:04:02 from Merlin Moncure

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Kevin Grittner	2011-09-18 15:17:17	Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Previous Message	Jeff Janes	2011-09-17 22:29:16	Re: Hash index use presently(?) discouraged since 2005: revive or bury it?