Re: row estimate for partial index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Harmen <harmen(at)lijzij(dot)de>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: row estimate for partial index
Date: 2023-01-16 14:59:38
Message-ID: 1860907.1673881178@sss.pgh.pa.us
Lists: pgsql-general

Harmen <harmen(at)lijzij(dot)de> writes:
> On Sat, Jan 14, 2023 at 11:23:07AM -0500, Tom Lane wrote:
>> If you are running a reasonably recent PG version you should be able to
>> fix that by setting up "extended statistics" on that pair of columns:

> CREATE STATISTICS dist4 (ndistinct) ON deleted, org_id FROM contactsbool;
> CREATE STATISTICS dist4b (ndistinct) ON org_id, deleted FROM contactsbool;

1. ndistinct is not the correct stats type for this problem.
(I think dependencies is, but generally speaking, it's not worth
trying to be smarter than the system about which ones you need.
Just leave out the kind list and it will create them all.)

2. Per the CREATE STATISTICS man page, the order of the columns is
not significant, so you're just doubling the amount of work for
ANALYZE without gaining anything.

I think you will find that

CREATE STATISTICS stats1 ON deleted, org_id FROM contactsbool;

is enough to fix this, once you've re-run ANALYZE on the table so
the new statistics get collected. It improved the estimate for me
in v14 and HEAD, anyway.
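
A minimal sketch of the whole sequence, for anyone following along. The
table, column, and statistics names come from this thread; the literal
org_id value, the shape of the test query, and the assumption that
"deleted" is a boolean column are hypothetical, so adjust them to the
real schema.

```sql
-- Sketch only: org_id = 123 and the query shape are hypothetical,
-- and "deleted" is assumed to be a boolean column.

-- Drop the two redundant ndistinct-only objects from the quoted attempt.
DROP STATISTICS IF EXISTS dist4, dist4b;

-- No kind list in parentheses: PostgreSQL builds every kind it supports
-- for the column pair (ndistinct, dependencies, and mcv on v12 and later).
CREATE STATISTICS stats1 ON deleted, org_id FROM contactsbool;

-- Extended statistics are only populated when the table is analyzed.
ANALYZE contactsbool;

-- Inspect what was collected (the pg_stats_ext view exists in v12 and later).
SELECT statistics_name, attnames, kinds, dependencies
FROM pg_stats_ext
WHERE tablename = 'contactsbool';

-- Re-check the planner's row estimate for a query on both columns.
EXPLAIN SELECT * FROM contactsbool WHERE org_id = 123 AND NOT deleted;
```

Comparing the EXPLAIN output before and after the ANALYZE step should
show whether the row estimate has moved closer to the real count.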

regards, tom lane
