Quick Links

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Stefan Keller <sfkeller(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Date:	2011-09-16 01:00:17
Message-ID:	CAHyXU0yHQM3e6afLuQETk+LKD-TZO39mrF2uL+rk0JD6RCeotA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Thu, Sep 15, 2011 at 5:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> HM, what if you junked the current hash indexam, and just implemented
>> a wrapper over btree so that the 'hash index' was just short hand for
>> hashing the value into a standard index?
>
> Surely creating such a wrapper would be *more* work than adding WAL
> support to the hash AM.
>
> I'm not entirely following this eagerness to junk that AM, anyway.
> We've put a lot of sweat into it over the years, in the hopes that
> it would eventually be good for something. It's on the edge of
> being good for something now, and there's doubtless room for more
> improvements, so why are the knives out?

Just making an observation. Some quick tests follow the sig. I think
the point here is that something has to be done -- now that the
replication train has left the station, not having WAL has gone from
quirky annoyance to major functionality failure. The recent hash work
has brought down index build times to a reasonable level, but they are
still getting beat by btree. Of course, it's not quite apples to
apples (I figure the timings will even out to an extent once you add
in the hashing wrapper), but I can't help but wonder if the btree code
is a better driver and consolidating code is a good thing.

merlin

postgres=# create table v as select generate_series(1,10000000) as x;
SELECT 10000000
Time: 16750.961 ms
postgres=# create index on v(x);
CREATE INDEX
Time: 15158.637 ms
postgres=# create index on v using hash(x);
CREATE INDEX
Time: 22505.468 ms

postgres=# \d v
Table "public.v"
Column | Type | Modifiers
--------+---------+-----------
x | integer |
Indexes:
"v_x_idx" btree (x)
"v_x_idx1" hash (x)

postgres=# select relname, relfilenode from pg_class where relname like 'v_x%';
relname | relfilenode
----------+-------------
v_x_idx | 16525
v_x_idx1 | 16526
(2 rows)

c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16525
09/15/2011 07:46 PM 224,641,024 16525

c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16526
09/15/2011 07:49 PM 268,451,840 16526

In response to

Re: Hash index use presently(?) discouraged since 2005: revive or bury it? at 2011-09-15 22:38:08 from Tom Lane

Responses

Re: Hash index use presently(?) discouraged since 2005: revive or bury it? at 2011-09-16 01:04:58 from Claudio Freire
Re: Hash index use presently(?) discouraged since 2005: revive or bury it? at 2011-09-16 04:20:43 from Merlin Moncure

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Claudio Freire	2011-09-16 01:04:58	Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Previous Message	Claudio Freire	2011-09-16 00:34:04	Re: Hash index use presently(?) discouraged since 2005: revive or bury it?