Quick Links

Re: n_distinct off by a factor of 1000

From:	Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To:	Fabio Pardi <f(dot)pardi(at)portavita(dot)eu>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: n_distinct off by a factor of 1000
Date:	2020-06-23 14:14:14
Message-ID:	3c310268-52f2-b960-8efd-bc15b08da7a7@aklaver.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 6/23/20 7:05 AM, Fabio Pardi wrote:
>
> On 23/06/2020 14:42, Klaudie Willis wrote:
>> I got my first hint of why this problem occurs when I looked at the
>> statistics. For the column in question, "instrument_ref" the
>> statistics claimed it to be:
>>
>> The default_statistics_target=500, and analyze has been run.
>> select * from pg_stats where attname like 'instr%_ref'; -- Result:
>> *40.000*
>> select count(distinct instrumentid_ref) from bigtable -- Result: *33
>> 385 922 (!!)*
>>
>> That is an astonishing difference of almost a 1000X.
>>
>
> I think you are counting 2 different things here.
>
> The first query returns all the columns "like 'instr%_ref'" present in
> the statistics (so in the whole cluster), while the second is counting
> the actual number of different rows in bigtable.

I believe the OP actually meant the query to be:

select n_distinct from pg_stats where attname like 'instr%_ref';

>
>
> regards,
>
> fabio pardi

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Re: n_distinct off by a factor of 1000 at 2020-06-23 14:05:49 from Fabio Pardi

Responses

Re: n_distinct off by a factor of 1000 at 2020-06-23 15:34:45 from Klaudie Willis

Browse pgsql-general by date

	From	Date	Subject
Next Message	Adrian Klaver	2020-06-23 14:28:34	Re: pg_dump empty tables
Previous Message	Fabio Pardi	2020-06-23 14:05:49	Re: n_distinct off by a factor of 1000