Re: Bogus ANALYZE results for an otherwise-unique column with many nulls

From:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Andreas Joseph Krogh <andreas(at)visena(dot)com>
Subject:	Re: Bogus ANALYZE results for an otherwise-unique column with many nulls
Date:	2016-08-05 17:40:53
Message-ID:	87oa57dsnn.fsf@news-spur.riddles.org.uk
Lists:	pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> Also, the way that the value is calculated in the
Tom> samples-not-all-distinct case corresponds to the way I have it in
Tom> the patch.

Ahh, gotcha. You're referring to this:

    /*
     * If we estimated the number of distinct values at more than 10% of
     * the total row count (a very arbitrary limit), then assume that
     * stadistinct should scale with the row count rather than be a fixed
     * value.
     */
    if (stats->stadistinct > 0.1 * totalrows)
        stats->stadistinct = -(stats->stadistinct / totalrows);

where "totalrows" includes nulls obviously. So this expects negative
stadistinct to be scaled by the total table size, and the all-distinct
case should do the same.
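
To make the scaling concrete, here is a minimal sketch of that convention. The helper names encode_stadistinct and decode_stadistinct are hypothetical and are not PostgreSQL functions; the encoding mirrors the snippet quoted above, and the decoding is my assumption about how a consumer would turn a negative value back into an absolute count.

```c
/*
 * Hypothetical helper (not PostgreSQL code): store a distinct-value
 * estimate the way the quoted snippet does.  totalrows counts every
 * row in the table, nulls included.
 */
static double
encode_stadistinct(double ndistinct, double totalrows)
{
    /* Above the 10% threshold, store a negative fraction of all rows. */
    if (ndistinct > 0.1 * totalrows)
        return -(ndistinct / totalrows);

    /* Otherwise keep the estimate as a fixed absolute count. */
    return ndistinct;
}

/*
 * Hypothetical helper: convert a stored value back into an absolute
 * count for a table that now holds ntuples rows (nulls included).
 * This is an assumed reading of "scaled by the total table size".
 */
static double
decode_stadistinct(double stadistinct, double ntuples)
{
    if (stadistinct < 0.0)
        return -stadistinct * ntuples;
    return stadistinct;
}
```

For a 1000-row table where 750 rows are null and the other 250 are all distinct, encode_stadistinct(250, 1000) gives -0.25, and decoding against 1000 rows gives 250 again. If the fraction were instead taken over the non-null rows only (-1.0), the same decoding would report 1000 distinct values, which is why the all-distinct case has to use the same total-rows denominator.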

Objection withdrawn.

--
Andrew (irc:RhodiumToad)

In response to

Re: Bogus ANALYZE results for an otherwise-unique column with many nulls at 2016-08-05 14:27:41 from Tom Lane

Responses

Re: Bogus ANALYZE results for an otherwise-unique column with many nulls at 2016-08-05 20:48:34 from Tom Lane
