Re: Use of additional index columns in rows filtering

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-07-24 17:54:31
Message-ID: CAH2-Wz=-u6t9qrMqCd81mk1cjHE1wKj-Esx8q9bGu4BYUWk_XA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 24, 2023 at 10:36 AM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> I'm getting a bit lost in this discussion as well -- for the purposes
> of this feature, we only need to know whether to push down a clause as
> an Index Filter or not, right?

I think so.

> Could we start out conservatively and push down as an Index Filter
> unless there is some other clause ahead of it that can't be pushed
> down? That would allow users to have some control by writing clauses in
> the desired order or wrapping them in functions with a declared cost.

I'm a bit concerned about cases like the one I described from the
regression tests.

The case in question shows a cheaper plan replacing a more expensive
plan -- so it's a win by every conventional measure. But the new plan
is less robust in the sense that I described yesterday: it will be
much slower than the current plan when there happen to be many more
"thousand = 42" tuples than expected. We have a very high chance of a
small benefit (we save repeated index page accesses), but a very low
chance of a high cost (we incur many more heap accesses), which seems
less than ideal.
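
To make the asymmetry concrete, here is a rough sketch of a toy cost
model. It is not PostgreSQL's planner costing: the constants, the
function names, and the assumption that the new plan pays one heap
access per "thousand = 42" tuple are all invented for illustration.

```python
# Toy cost model. All constants are hypothetical and are NOT taken from
# PostgreSQL's planner; they only illustrate the shape of the trade-off.
INDEX_DESCENT_COST = 4.0  # assumed cost of one index descent
HEAP_ACCESS_COST = 1.0    # assumed cost of one heap access


def cost_index_quals(num_descents: int, matching_tuples: int) -> float:
    """Current plan: repeated index descents, heap access only for matches."""
    return num_descents * INDEX_DESCENT_COST + matching_tuples * HEAP_ACCESS_COST


def cost_index_filter(thousand_42_tuples: int) -> float:
    """New plan: one descent, but (pessimistically assumed) one heap
    access for every "thousand = 42" tuple."""
    return INDEX_DESCENT_COST + thousand_42_tuples * HEAP_ACCESS_COST


# Compare both plans as the number of "thousand = 42" tuples grows,
# assuming 3 descents and 3 matching tuples for the current plan.
for n in (5, 10, 1000, 100000):
    print(n, cost_index_quals(3, 3), cost_index_filter(n))
```

Under these made-up numbers the new plan wins by a small margin while
the tuple count stays near the estimate, and loses by a margin that
grows linearly once the estimate is badly off. The current plan's cost
does not depend on that count at all.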

One obvious way of avoiding that problem (that's probably overly
conservative) is to just focus on the original complaint from Maxim.
The original concern was limited to non-key columns from INCLUDE
indexes. If you only apply the optimization there then you don't run
the risk of generating a path that "outcompetes" a more robust path
in the sense that I've described. This is obviously true because there
can't possibly be index quals/scan keys for non-key columns within the
index AM.
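
As a rough sketch of that conservative rule (not code from any patch;
the function name and the column-list representation are hypothetical),
a clause would qualify only when every column it references is a
non-key INCLUDE column of the index:

```python
from typing import Iterable


def pushable_as_index_filter(
    clause_columns: Iterable[str],
    key_columns: Iterable[str],
    include_columns: Iterable[str],
) -> bool:
    """Hypothetical check for the conservative rule: allow push-down only
    when the clause references nothing but non-key INCLUDE columns."""
    cols = set(clause_columns)
    if not cols:
        return False  # a clause with no column references is out of scope here
    # Every referenced column must be an INCLUDE column, and none may be a key
    # column (key columns could have been index quals/scan keys instead).
    return cols <= set(include_columns) and not (cols & set(key_columns))


# Assumed example: an index with key column "a" and INCLUDE column "b".
print(pushable_as_index_filter(["b"], ["a"], ["b"]))       # clause on b only
print(pushable_as_index_filter(["a"], ["a"], ["b"]))       # clause on key column a
print(pushable_as_index_filter(["a", "b"], ["a"], ["b"]))  # clause mixing a and b
```

Rejecting mixed clauses is a choice made for this sketch; the point is
only that a clause on key columns never reaches the filter path, so it
cannot displace a plan that would have used index quals.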

--
Peter Geoghegan
