Re: Hash Indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hash Indexes
Date: 2016-10-01 02:43:25
Message-ID: CAA4eK1KtPW+gR4hafBYgdz-rd0V7XzmqMcTynA4P_u=6+VtYdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30-Sep-2016 10:26 PM, "Peter Geoghegan" <pg(at)heroku(dot)com> wrote:
>
> On Fri, Sep 30, 2016 at 4:46 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> > I would just be very disappointed if, after the amount of work that
> > Amit and others have put into this project, the code gets rejected
> > because somebody thinks a different project would have been more worth
> > doing.
>
> I wouldn't presume to tell anyone else how to spend their time, and am
> not concerned about this patch making the hash index code any less
> useful from the user's perspective. If this is how we remove the wart
> of hash indexes not being WAL-logged, that's fine by me. I'm trying to
> be helpful.
>

If that is fine, then I think we should do that. I want to bring it to
your notice that we have already seen and reported that with proposed set
of patches, hash indexes are good bit faster than btree, so that adds
additional value in making them WAL-logged.

> > As Tom said upthread: $But to kick the hash AM as such to the
> > curb is to say
> > "sorry, there will never be O(1) index lookups in Postgres".$ I
> > think that's correct and a sufficiently-good reason to pursue this
> > work, regardless of the merits (or lack of merits) of hash-over-btree.
>
> I don't think that "O(1) index lookups" is a useful guarantee with a
> very expensive constant factor.

The constant factor doesn't play much role when data doesn't have
duplicates or have lesser duplicates.

Amit seemed to agree with this, since
> he spoke of the importance of both theoretical performance benefits
> and practically realizable performance benefits.
>

No, I don't agree with that rather I think hash indexes are theoretically
faster than btree and we have seen that practically as well for quite a few
cases (for read workloads - when used with unique data and also in nest
loops).

With Regards,
Amit Kapila

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-10-01 02:56:24 Re: COPY as a set returning function
Previous Message Tom Lane 2016-10-01 02:24:08 contrib/pg_visibility craps out in assert-enabled builds