Re: row estimate for partial index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Harmen <harmen(at)lijzij(dot)de>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: row estimate for partial index
Date: 2023-01-14 16:23:07
Message-ID: 1008139.1673713387@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Harmen <harmen(at)lijzij(dot)de> writes:
> Works well enough. However, we now have an org_id which has > 10% of the rows,
> but only a handful rows where "deleted is null" matches (so the org has a lot
> of "deleted" contacts). The planner doesn't like this and it falls back to a
> full table scan for the above query.

> I've added a dedicated index just for that org_id, to see if that helps:
> "org123" btree (id) WHERE deleted IS NULL AND org_id = 123
> The planner seems to use it now, however the row estimate is way off:

Yeah, so that indicates that it isn't producing a good selectivity
estimate for the combination of those two conditions: it will assume
the org_id and deleted columns are independent, which per your statements
they are not.

If you are running a reasonably recent PG version you should be able to
fix that by setting up "extended statistics" on that pair of columns:

https://www.postgresql.org/docs/current/planner-stats.html#PLANNER-STATS-EXTENDED

(I might be wrong, but I think that will help even when one of
the troublesome conditions is a null-check. If it doesn't, then
we have something to improve there ...)

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message hamann.w 2023-01-14 16:32:56 SQL question
Previous Message Tom Lane 2023-01-14 16:17:24 Re: Why is a hash join preferred when it does not fit in work_mem