Re: BUG #17594: conditional hash indexes size (hash index ignore WHERE condition during CREATE INDEX?)

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17594: conditional hash indexes size (hash index ignore WHERE condition during CREATE INDEX?)
Date: 2022-08-26 08:04:12
Message-ID: CAK-MWwRyJ3+e0As1tk5MB_Fho8vjezD6Bbrue4riz5Kabk_qkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Aug 25, 2022 at 11:49 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> > I found very weird situation with size of highly selective partial hash
> > indexes (they has almost same size as full table hash index).
>
> hashbuild() sets up the initial hash index size based on estimating
> the current number of rows in the table, without any correction for
> partial-index selectivity. I don't find this to be a bug particularly.
> The odds of making things worse via a bad estimate seem at least as
> high as the odds of making things better.
>
> regards, tom lane
>

Sometimes it lead to unexpected results, for hash indexes most common
(imho) use case is indexing long text/varchar columns on big tables where
they provide substantial size reduction over usual btree indexes (or in
extreme case of very long test columns - btree cannot be used at all). Now
in the case of a partial hash index over a huge table (with highly
selective conditions thus covering only a tiny part of rows) - size of
resulting hash index could be a huge surprise (in the practical case where
I found this issue it was 200MB hash index with only 5 rows in it).
Are most pages of hash index in that case will be completely empty and
almost never touched by the database (thus wasting only disk space) or the
resulting hash index will have very low rows per index page ratio (thus
inefficiently use shared buffers as well)?

--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно
когда я так делаю ещё раз?"

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-08-26 08:04:32 Re: pg_rewind WAL segments deletion pitfall
Previous Message PG Bug reporting form 2022-08-26 04:07:14 BUG #17596: "invalid attribute number 11" when updating partitioned table with a MULTIEXPR_SUBLINK