Re: n_distinct off by a factor of 1000

From: Michael Lewis <mlewis(at)entrata(dot)com>
To: Pavel Luzanov <p(dot)luzanov(at)postgrespro(dot)ru>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: n_distinct off by a factor of 1000
Date: 2020-06-25 17:34:02
Message-ID: CAHOFxGqo9biZf0knrZyhjFxFye9fS1Ea-BaU2rwcG3Mm8j-Bwg@mail.gmail.com
Lists: pgsql-general

On Thu, Jun 25, 2020 at 7:27 AM Pavel Luzanov <p(dot)luzanov(at)postgrespro(dot)ru>
wrote:

> I have tried to increase the statistics target to 5000, and it helps, but
> it reduces the error to 100X. Still crazy high.
>
>
> As far as I know, increasing default_statistics_target will not help. [1]
>
> I have considered these fixes:
> - hardcode the statistics to a particular ratio of the total number of rows
>
>
> You can hardcode the percentage of distinct values:
> ALTER TABLE bigtable ALTER COLUMN instrument_ref SET ( n_distinct=-0.067 );
> /* -1 * (33385922 / 500000000) */
>
>
> [1]
> https://www.postgresql.org/message-id/4136ffa0812111823u645b6ec9wdca60b3da4b00499%40mail.gmail.com
>

Thanks for sharing. Very interesting read. If anyone has a reference to the
papers alluded to, that would be appreciated. I had forgotten about the
option to set negative values.
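
For anyone following along, a minimal sketch of applying the override,
assuming the table and column names from this thread and the ratio from the
comment above (33385922 / 500000000 ≈ 0.067). The setting is only picked up
when statistics are next gathered:

-- Tell the planner that roughly 6.7% of rows are distinct
-- (negative n_distinct values mean "this fraction of the row count").
ALTER TABLE bigtable
    ALTER COLUMN instrument_ref SET (n_distinct = -0.067);

-- The override takes effect at the next ANALYZE.
ANALYZE bigtable;

-- Verify what the planner will now use.
SELECT n_distinct
  FROM pg_stats
 WHERE tablename = 'bigtable'
   AND attname = 'instrument_ref';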
