Quick Links

Re: n_distinct off by a factor of 1000

From:	Fabio Pardi <f(dot)pardi(at)portavita(dot)eu>
To:	pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: n_distinct off by a factor of 1000
Date:	2020-06-23 14:05:49
Message-ID:	486df039-f581-411c-c169-6d87ce81bcf6@portavita.eu
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 23/06/2020 14:42, Klaudie Willis wrote:
> I got my first hint of why this problem occurs when I looked at the statistics. For the column in question, "instrument_ref" the statistics claimed it to be:
>
> The default_statistics_target=500, and analyze has been run.
> select * from pg_stats where attname like 'instr%_ref'; -- Result: *40.000*
> select count(distinct instrumentid_ref) from bigtable -- Result: *33 385 922 (!!)*
>
> That is an astonishing difference of almost a 1000X.
>

I think you are counting 2 different things here.

The first query returns all the columns "like 'instr%_ref'" present in the statistics (so in the whole cluster), while the second is counting the actual number of different rows in bigtable.

regards,

fabio pardi

In response to

n_distinct off by a factor of 1000 at 2020-06-23 12:42:18 from Klaudie Willis

Responses

Re: n_distinct off by a factor of 1000 at 2020-06-23 14:14:14 from Adrian Klaver

Browse pgsql-general by date

	From	Date	Subject
Next Message	Adrian Klaver	2020-06-23 14:14:14	Re: n_distinct off by a factor of 1000
Previous Message	Adrian Klaver	2020-06-23 13:56:35	Re: pg_dump empty tables