Re: WTF with hash index?

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Олег Самойлов <splarv(at)ya(dot)ru>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: WTF with hash index?
Date: 2018-11-13 17:55:49
Message-ID: 53b360cd94eaf9823e3b8746c953351f54cdde80.camel@cybertec.at
Lists: pgsql-general

Олег Самойлов wrote:
> \set table_size 1000000
> begin;
> create table gender (gender varchar);
>
> insert into gender (gender) select case when random<0.50 then 'female' when random<0.99 then 'male' else 'other' end from (select random() as random, generate_series(1,:table_size)) as subselect;
>
> create index gender_btree on gender using btree (gender);
> create index gender_hash on gender using hash (gender);
> commit;
> vacuum full analyze;
>
> Vacuum full is not necessary here, just a little voodoo programming. I expected the hash index to be much smaller and faster than the btree index, because it doesn’t store the values themselves, only their hashes. But:
>
> => \d+
>                    List of relations
>  Schema |  Name  | Type  | Owner | Size  | Description
> --------+--------+-------+-------+-------+-------------
>  public | gender | table | olleg | 35 MB |
> (1 row)
>
> => \di+
>                          List of relations
>  Schema |     Name     | Type  | Owner | Table  | Size  | Description
> --------+--------------+-------+-------+--------+-------+-------------
>  public | gender_btree | index | olleg | gender | 21 MB |
>  public | gender_hash  | index | olleg | gender | 47 MB |
> (2 rows)
>
> The hash index is not only larger than the btree index, but also larger than the table itself. What is wrong with the hash index?

I guess the problem here is that there are so few distinct values, so
all the index items end up in only three hash buckets, forming large
linked lists.

I can't tell off-hand why that would make the index so large though.
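
One way to look into it, as a rough sketch: assuming PostgreSQL 10 or later
and that the pgstattuple contrib module can be installed, pgstathashindex()
breaks a hash index down into bucket, overflow, bitmap and unused pages.
I have not run this against the table above, so treat it as a starting
point rather than a diagnosis.

```sql
-- Assumption: PostgreSQL 10+ with the pgstattuple contrib module available.
create extension if not exists pgstattuple;

-- Show how the pages of the hash index are distributed.
-- A large overflow_pages count would fit the "long chains" guess;
-- a large unused_pages count would point to preallocated empty buckets.
select * from pgstathashindex('gender_hash');
```

Comparing bucket_pages, overflow_pages and unused_pages should show where
the 47 MB actually goes.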

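Anyway, indexes are pretty useless in such a case: with so few distinct
values, most lookups match a large share of the table, and a sequential
scan is usually cheaper.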

Is the behavior the same if you have many distinct values?
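
Something like the following could be used to check, as a sketch only.
The table and index names (gender_distinct and so on) are made up for this
example, and md5() of the row number is just a convenient way to get one
distinct value per row. Those values are longer than 'female' or 'male',
so the btree size is not directly comparable with the numbers above.

```sql
\set table_size 1000000
begin;
-- Hypothetical test table: one distinct 32-character value per row.
create table gender_distinct (val varchar);

insert into gender_distinct (val)
  select md5(i::text) from generate_series(1, :table_size) as i;

create index gender_distinct_btree on gender_distinct using btree (val);
create index gender_distinct_hash on gender_distinct using hash (val);
commit;
analyze gender_distinct;

-- Compare the size of the table and both indexes.
select relname, pg_size_pretty(pg_relation_size(oid)) as size
from pg_class
where relname like 'gender_distinct%'
order by relname;
```

If the hash index comes out smaller than the btree index here, that would
support the idea that the low number of distinct values is the cause.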

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
