Re: Use index to estimate expression selectivity

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bono Stebler <bono(dot)stebler(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Use index to estimate expression selectivity
Date: 2023-11-23 17:30:58
Message-ID: 3130619.1700760658@sss.pgh.pa.us
Lists: pgsql-hackers

Bono Stebler <bono(dot)stebler(at)gmail(dot)com> writes:
> After discussing the issue on irc, it looks like it could be possible
> for the planner to use a partial index matching an expression exactly to
> estimate its selectivity.

I think going forward we're going to be more interested in extending
CREATE STATISTICS than in adding special behaviors around indexes.
An index is a pretty expensive thing to maintain if you really only
want some statistics. Contrariwise, if you need the index for
functional reasons (perhaps to enforce some strange uniqueness
constraint) but you don't like some decision the planner takes because
of the existence of that index, you're kind of stuck. So decoupling
this stuff makes more sense from where I sit.

Having said that ...

> Here is a simplified version (thanks ysch) of the issue I am facing:
> https://dbfiddle.uk/flPq8-pj
> I have tried using CREATE STATISTICS as well but haven't found a way to
> improve the planner estimation for that query.

I assume what you did was try to make stats on "synchronized_at IS
DISTINCT FROM updated_at"? Yeah, it does not surprise me that we fail
to match that to this query. The trouble with expression statistics
(and expression indexes) is that it's impractical to match every
subexpression of the query to every subexpression that might be
presented by CREATE STATISTICS: you soon get into exponential
behavior. So there's a limited set of contexts where we look for
a match.

I experimented a bit and found that if you do have statistics on that,
then "WHERE (synchronized_at IS DISTINCT FROM updated_at) IS TRUE"
will consult the stats. Might do as a hacky workaround.
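
For illustration, a minimal sketch of that workaround follows. The table name "items" and the statistics name are hypothetical (the real schema lives in the linked fiddle); only the two column names come from the query under discussion. It assumes PostgreSQL 14 or later, where CREATE STATISTICS accepts expressions.

```sql
-- Hypothetical table name "items"; substitute the table from the fiddle.
-- Expression statistics require PostgreSQL 14 or later.
CREATE STATISTICS items_sync_distinct_stats
    ON (synchronized_at IS DISTINCT FROM updated_at)
    FROM items;

-- Extended statistics are only populated by ANALYZE.
ANALYZE items;

-- Original form of the predicate, for comparison of the row estimate.
EXPLAIN
SELECT *
FROM items
WHERE synchronized_at IS DISTINCT FROM updated_at;

-- Workaround form: wrapping the expression in IS TRUE is what allowed
-- the expression statistics to be consulted in the experiment above.
EXPLAIN
SELECT *
FROM items
WHERE (synchronized_at IS DISTINCT FROM updated_at) IS TRUE;
```

Comparing the estimated row counts of the two EXPLAIN outputs should show whether the statistics are being picked up on a given server version.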

regards, tom lane
