From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomasz Ostrowski <tometzky+pg(at)ato(dot)waw(dot)pl>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Subject: Re: Multicolumn hash indexes
Date: 2017-09-29 14:45:49
Message-ID: CA+TgmoY9O9CesFCivK8Pq77aon5Sy=6LTTb6y_hcUYAQ1NxpoA@mail.gmail.com
Lists: pgsql-hackers
On Wed, Sep 27, 2017 at 5:52 PM, Tomasz Ostrowski
<tometzky+pg(at)ato(dot)waw(dot)pl> wrote:
> I feel that this would eliminate much of the potential gain from such
> an index. It would be usable only when a sufficiently variable column
> exists, in which case a simple hash index on that column alone
> wouldn't be much worse.
If the index isn't very selective and any query will match many rows
anyway, I think a BRIN index will often be the better choice (but I
might be wrong).
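For example, something like this (an untested sketch; the table and
column names are made up):

    -- Hypothetical table with a low-selectivity column.  If rows with
    -- the same region_id tend to be physically clustered, a BRIN index
    -- is tiny and can prune whole block ranges:
    CREATE TABLE events (region_id int, payload text);
    CREATE INDEX events_region_brin ON events USING brin (region_id);

    -- The single-column hash index it would be competing with:
    CREATE INDEX events_region_hash ON events USING hash (region_id);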
Maybe you're worrying about something like a billion-row table where
there are 3 columns that form a composite key: (1,1,1), (1,1,2), ...,
(1,1,1000), (1,2,1), ..., (1,1000,1000), (2,1,1), ...,
(1000,1000,1000). In that case, treating the leading column as most
important will indeed be terrible, since we'd put all one billion rows
into 1000 buckets no matter how many bucket splits we do.
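Concretely (an untested sketch of that shape; you wouldn't want to
actually run this at full size):

    -- 1000 x 1000 x 1000 = a billion rows; each column alone has only
    -- 1000 distinct values, but the three together form a unique key.
    CREATE TABLE t AS
    SELECT a, b, c
      FROM generate_series(1, 1000) a,
           generate_series(1, 1000) b,
           generate_series(1, 1000) c;

    -- If a multicolumn hash index bucketed mainly on the leading
    -- column, hashing column a yields at most 1000 distinct hash
    -- values, so the rows pile up at 10^9 / 1000 = 10^6 per value
    -- no matter how far the buckets split.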
That seems a little unusual, though.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company