Re: Next Steps with Hash Indexes

From:	Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To:	Sadhuprasad Patro <b(dot)sadhu(at)gmail(dot)com>
Cc:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Next Steps with Hash Indexes
Date:	2021-09-23 05:41:23
Message-ID:	CAFiTN-vYxT1-F-cPqvWL=u0sTa1H8jnoZ8hkVAcPGmxBqQFG3w@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Sep 23, 2021 at 10:04 AM Sadhuprasad Patro <b(dot)sadhu(at)gmail(dot)com> wrote:
>
> > > One more thing to consider is that it seems that the planner requires
> > > a condition for the first column of an index before considering an
> > > indexscan plan. See Tom's email [1] in this regard. I think it would
> > > be better to see what kind of work is involved there if you want to
> > > explore a single hash value for all columns idea.
> > >
> > > [1] - https://www.postgresql.org/message-id/29263.1506483172%40sss.pgh.pa.us
> >
> > About this point, I will analyze further and update.
> >
>
> I have checked the planner code, and there do not seem to be any
> complicated changes needed if we take up a single hash value
> for all columns. Below are the major changes needed:
>
> In build_index_paths(), there is a check like, "if (index_clauses ==
> NIL && !index->amoptionalkey)", which helps to figure out if the
> leading column has any clause or not. This needs to be moved out of
> the loop and changed to check for clauses on all key columns.
> With this, we need to add an "amallcolumncluse" field to the Index
> structure, which will be set to TRUE for HASH indexes and FALSE in
> other cases.

Right, we can add an AM-level option and, based on that, decide
whether to consider the index scan when conditions are not given for
all the key columns. The changes don't look that complicated.
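
To make the proposed check concrete, here is a minimal standalone sketch in C. It is not the actual planner code: the struct, the function name, and the per-column clause-count array are hypothetical simplifications, and only "amoptionalkey" and the proposed "amallcolumncluse" flag (spelled as in the quoted proposal) come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the planner's per-index info. */
typedef struct SketchIndexInfo
{
    int  nkeycolumns;       /* number of key columns in the index */
    bool amoptionalkey;     /* AM can scan without a clause on column 1 */
    bool amallcolumncluse;  /* proposed flag: AM needs a clause on
                             * every key column (TRUE for hash) */
} SketchIndexInfo;

/*
 * Decide whether an index scan can be considered at all.
 * nclauses[i] is the number of usable index clauses found for key
 * column i (an assumption of this sketch; the real planner keeps
 * per-column clause lists instead).
 */
bool
sketch_index_is_usable(const SketchIndexInfo *index, const int *nclauses)
{
    assert(index != NULL && nclauses != NULL);
    assert(index->nkeycolumns > 0);

    /* Existing rule: the leading column needs a clause unless the AM
     * says the leading key is optional. */
    if (nclauses[0] == 0 && !index->amoptionalkey)
        return false;

    /* Proposed rule, checked outside the per-column matching loop:
     * such an AM needs a clause on every key column. */
    if (index->amallcolumncluse)
    {
        for (int i = 0; i < index->nkeycolumns; i++)
        {
            if (nclauses[i] == 0)
                return false;
        }
    }

    return true;
}
```

With this model, a two-column index that sets the new flag is rejected as soon as any key column has no usable clause, while an index without the flag still only needs a clause on its leading column. If a qual on one key column is removed or ends up as a join qual that is not available as an index clause, that column's count is zero and the index is rejected.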

>
> And to get the multi-column hash index selected, we may set
> enable_hashjoin = off, to avoid any condition becoming a join
> condition; I saw similar behavior in other DBs as well...

This may be related to Tom's point: if some of the quals are removed
by optimization or converted to join quals, then even if the user has
given quals on all the key columns, the index scan will not be
selected, because we will be requiring that the hash index can only be
selected if it has quals on all the key attributes.

I don't think suggesting enable_hashjoin = off is a solution. The same
thing can happen with a merge join or a nested loop join with a
materialize node. In all such cases the join filter cannot be pushed
down to the inner node, because the outer node will not start to scan
until we materialize/sort/hash the inner node. But yes, if we test
this behavior in other databases as well and it turns out that this is
how the hash index is used there, then maybe this behavior can be
documented.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Re: Next Steps with Hash Indexes at 2021-09-23 04:33:56 from Sadhuprasad Patro

Responses

Re: Next Steps with Hash Indexes at 2021-09-27 05:52:34 from Amit Kapila
