Quick Links

Re: Use of additional index columns in rows filtering

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, markus(dot)winand(at)winand(dot)at
Subject:	Re: Use of additional index columns in rows filtering
Date:	2023-08-02 00:50:14
Message-ID:	CAH2-WzkGJGeGK7n7v6DNts19B+UzHGrLQsZwQtGt2OwEY8G3XQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Jul 24, 2023 at 11:59 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > That might be true but I'm not sure what to do about that unless we
> > incorporate some "robustness" measure into the costing. If every
> > measure we have says one plan is better, don't we have to choose it?
>
> I'm mostly concerned about the possibility itself -- it's not a matter
> of tuning the cost. I agree that that approach would probably be
> hopeless.

This seems related to the fact that EXPLAIN doesn't expose the
difference between what Markus Winand calls "Access Predicates" and
"Index Filter Predicates", as explained here:

https://use-the-index-luke.com/sql/explain-plan/postgresql/filter-predicates

That is, both "Access Predicates" and "Index Filter Predicates" are
shown after an "Index Cond: " entry in Postgres EXPLAIN output, in
general. Even though these are two very different things. I believe
that the underlying problem for the implementation (the reason why we
can't easily break this out further in EXPLAIN output) is that we
don't actually know what kind of predicate it is ourselves -- at least
not until execution time. We wait until then to do nbtree
preprocessing/scan setup. Though perhaps we should do more of this
during planning instead [1], for several reasons (fixing this is just
one of those reasons).

The risk to "robustness" for cases like the one I drew attention to on
this thread would probably have been obvious all along if EXPLAIN
output were more like what Markus would have us do -- he certainly has
a good point here, in general.

Breaking things out in EXPLAIN output along these lines might also
give us a better general sense of when a similar plan shift like this
was actually okay -- even according to something like my
non-traditional "query robustness" criteria. It's much harder for me
to argue that a shift in plans from what Markus calls an "Index Filter
Predicate" to what the patch will show under "Index Filter:" is a
novel new risk. That would be a much less consequential difference,
because those two things are fairly similar anyway.

Besides, such a shift in plan would have to "legitimately win" for
things to work out like this. If we're essentially picking between two
different subtypes of "Index Filter Predicate", then there can't be
the same weird second order effects that we see when an "Access
Predicate" is out-competed by an "Index Filter Predicate". It's
possible that expression evaluation of a small-ish conjunctive
predicate like "Index Filter: ((tenthous = 1) OR (tenthous = 3) OR
(tenthous = 42))" will be faster than a natively executed SAOP. You
can't do that kind of expression evaluation in the index AM itself
(assuming that there is an opclass for nbtree to use in the first
place, which there might not be in the case of any non-key INCLUDE
columns). With the patch, you can do all this. And I think that you
can derisk it without resorting to the overly conservative approach of
limiting ourselves to non-key columns from INCLUDE indexes.

To summarize: As Markus says on the same page. "Index filter
predicates give a false sense of safety; even though an index is used,
the performance degrades rapidly on a growing data volume or system
load". That's essentially what I want to avoid here. I'm much less
concerned about competition between what are really "Index Filter
Predicate" subtypes. Allowing that competition to take place is not
entirely risk-free, of course, but it seems about as risky as anything
else in this area.

[1] https://www.postgresql.org/message-id/2587523.1647982549@sss.pgh.pa.us
--
Peter Geoghegan

In response to

Re: Use of additional index columns in rows filtering at 2023-07-24 18:59:32 from Peter Geoghegan

Responses

Re: Use of additional index columns in rows filtering at 2023-08-02 13:48:28 from Tomas Vondra

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andy Fan	2023-08-02 01:05:44	Re: Extract numeric filed in JSONB more effectively
Previous Message	Ranier Vilela	2023-08-02 00:46:24	Re: Support to define custom wait events for extensions