Re: Hash index initial size is too large given NULLs or partial indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash index initial size is too large given NULLs or partial indexes
Date: 2019-03-09 04:25:08
Message-ID: CAA4eK1L8brCWKYTbNwcTD_-YSECDph2SNVVX5-D1JPBQJRvMSw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 8, 2019 at 11:57 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On 3/8/19 7:14 PM, Jeff Janes wrote:
>
> > This goes back to when the pre-sizing was implemented in 2008
> > (c9a1cc694abef737548a2a). It seems to be an oversight, rather than
> > something that was considered.
> >
> > Is this a bug that should be fixed? Or if getting a more accurate
> > estimate is not possible or not worthwhile, add a code comment about that?
> >
>
> I'd agree this smells like a bug (or perhaps two). The sizing probably
> should consider both null_frac and selectivity of the index predicate.
>

Like you guys, I also think this area needs improvement. I am not
sure how easy it is to get the selectivity of the predicate in this
code path. Looking at how set_plain_rel_size() does it during path
generation in the planner should give us some idea.
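
To make the sizing arithmetic concrete, here is a rough sketch in
Python rather than PostgreSQL C. It is not the actual hash AM code:
the function names and the tuples_per_bucket value are hypothetical,
and multiplying (1 - null_frac) by the predicate selectivity assumes
the two are independent, which will not hold when the predicate
itself tests the indexed column.

```python
import math


def estimated_index_tuples(reltuples: float, null_frac: float,
                           pred_selectivity: float = 1.0) -> float:
    """Estimate how many heap tuples would actually land in the index.

    Assumes NULL keys are not indexed and that the NULL fraction and the
    partial-index predicate are statistically independent.
    """
    if not 0.0 <= null_frac <= 1.0:
        raise ValueError("null_frac must be within [0, 1]")
    if not 0.0 <= pred_selectivity <= 1.0:
        raise ValueError("pred_selectivity must be within [0, 1]")
    return reltuples * (1.0 - null_frac) * pred_selectivity


def initial_buckets(num_tuples: float, tuples_per_bucket: int = 100) -> int:
    """Round the wanted bucket count up to a power of two (minimum 2).

    tuples_per_bucket is an arbitrary illustrative value, not a real
    PostgreSQL constant.
    """
    target = max(2, math.ceil(num_tuples / tuples_per_bucket))
    return 1 << (target - 1).bit_length()


# A table of one million rows where 90% of the indexed column is NULL.
naive = initial_buckets(1_000_000)
adjusted = initial_buckets(estimated_index_tuples(1_000_000, 0.9))
print(naive, adjusted)
```

With these made-up numbers, sizing from the raw row count asks for
16 times as many buckets as sizing from the adjusted estimate.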

Another idea could be that we don't create the buckets till we know
the exact number of tuples returned by IndexBuildHeapScan. Basically,
I think we need to spool the tuples, create the appropriate buckets
and then insert the tuples. We might want to do this only when some
index predicate is present.
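
The spool-then-size idea above could look roughly like the following
toy model, again in Python and not PostgreSQL code. build_hash_index,
key_of, predicate and tuples_per_bucket are all hypothetical names,
and an in-memory list stands in for whatever spool a real build would
use.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple

Entry = Tuple[Any, int]  # (key, position of the row in the scan)


def build_hash_index(rows: Iterable[Any],
                     key_of: Callable[[Any], Any],
                     predicate: Optional[Callable[[Any], bool]] = None,
                     tuples_per_bucket: int = 4) -> List[List[Entry]]:
    """Spool qualifying tuples first, then size the buckets, then insert."""
    # Pass 1: scan and spool only tuples that will really be indexed.
    spool: List[Entry] = []
    for tid, row in enumerate(rows):
        if predicate is not None and not predicate(row):
            continue  # fails the partial-index predicate
        key = key_of(row)
        if key is None:
            continue  # NULL keys are skipped in this model
        spool.append((key, tid))

    # Size the bucket array from the exact count, not from an estimate.
    target = max(2, -(-len(spool) // tuples_per_bucket))  # ceiling division
    nbuckets = 1 << (target - 1).bit_length()
    buckets: List[List[Entry]] = [[] for _ in range(nbuckets)]

    # Pass 2: insert the spooled tuples into the right-sized buckets.
    for key, tid in spool:
        buckets[hash(key) & (nbuckets - 1)].append((key, tid))
    return buckets


rows = [{"id": i, "v": i if i % 10 == 0 else None} for i in range(1000)]
index = build_hash_index(rows, key_of=lambda r: r["v"],
                         predicate=lambda r: r["id"] < 500)
print(len(index), sum(len(b) for b in index))
```

The cost is holding (or writing out) every qualifying tuple before the
first insert, which is one reason to restrict this to builds where an
index predicate is present.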

If somebody is interested in doing the legwork, I can help with
reviewing the patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
