| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Harmen <harmen(at)lijzij(dot)de> |
| Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
| Subject: | Re: row estimate for partial index |
| Date: | 2023-01-14 16:23:07 |
| Message-ID: | 1008139.1673713387@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
Harmen <harmen(at)lijzij(dot)de> writes:
> Works well enough. However, we now have an org_id which has > 10% of the rows,
> but only a handful rows where "deleted is null" matches (so the org has a lot
> of "deleted" contacts). The planner doesn't like this and it falls back to a
> full table scan for the above query.
> I've added a dedicated index just for that org_id, to see if that helps:
> "org123" btree (id) WHERE deleted IS NULL AND org_id = 123
> The planner seems to use it now, however the row estimate is way off:
Yeah, so that indicates that it isn't producing a good selectivity
estimate for the combination of those two conditions: it will assume
the org_id and deleted columns are independent, which per your statements
they are not.
If you are running a reasonably recent PG version you should be able to
fix that by setting up "extended statistics" on that pair of columns:
https://www.postgresql.org/docs/current/planner-stats.html#PLANNER-STATS-EXTENDED
(I might be wrong, but I think that will help even when one of
the troublesome conditions is a null-check. If it doesn't, then
we have something to improve there ...)
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | hamann.w | 2023-01-14 16:32:56 | SQL question |
| Previous Message | Tom Lane | 2023-01-14 16:17:24 | Re: Why is a hash join preferred when it does not fit in work_mem |