From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andreas Joseph Krogh <andreas(at)visena(dot)com> |
Subject: | Re: Bogus ANALYZE results for an otherwise-unique column with many nulls |
Date: | 2016-08-07 07:01:40 |
Message-ID: | CAEZATCVQ9AGw1thJiViYWHXXZ46_p6FfDPBeyTC9BSNDz+6L6g@mail.gmail.com |
Lists: | pgsql-hackers |
On 5 August 2016 at 21:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> OK, thanks. What shall we do about Andreas' request to back-patch this?
> I'm personally willing to do it, but there is the old bugaboo of "maybe
> it will destabilize a plan that someone is happy with".
>
My inclination would be to back-patch it because arguably it's a
bug-fix -- at the very least the old behaviour didn't match the docs
for stadistinct:
    The number of distinct nonnull data values in the column.
    A value greater than zero is the actual number of distinct values.
    A value less than zero is the negative of a multiplier for the number
    of rows in the table; for example, a column in which values appear about
    twice on the average could be represented by
    <structfield>stadistinct</> = -0.5.
Additionally, I think that example is misleading because it's only
really true if there are no null values in the column. Perhaps it
would help to have a more explicit example to illustrate how nulls
affect stadistinct, for example:
    ... for example, a column in which about 80% of the values are nonnull
    and each nonnull value appears about twice on average could be
    represented by <structfield>stadistinct</> = -0.4.
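
To spell out the arithmetic behind both examples, here is a minimal Python sketch. The function name and its parameters are purely illustrative and are not anything in PostgreSQL; it only assumes that the negative form of stadistinct is the number of distinct nonnull values divided by the total row count, nulls included.

```python
def stadistinct_multiplier(null_frac: float, avg_repeats: float) -> float:
    """Illustrative only: negative-multiplier form of stadistinct.

    null_frac   -- fraction of rows in which the column is null
    avg_repeats -- average number of times each nonnull value appears
    """
    if not 0.0 <= null_frac < 1.0:
        raise ValueError("null_frac must be in [0, 1)")
    if avg_repeats < 1.0:
        raise ValueError("avg_repeats must be at least 1")
    # Distinct nonnull values per table row, nulls included in the row count.
    distinct_per_row = (1.0 - null_frac) / avg_repeats
    return -distinct_per_row


# Example from the current docs: no nulls, each value appears about twice.
print(stadistinct_multiplier(0.0, 2.0))

# Proposed example: 80% nonnull, each nonnull value appears about twice.
print(stadistinct_multiplier(0.2, 2.0))
```

With no nulls the result is -0.5, as in the existing docs example; with 20% nulls it comes out at about -0.4, which is the figure in the proposed wording.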
Regards,
Dean