Re: Understanding histograms

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	len(at)pdx(dot)edu, len(at)cs(dot)pdx(dot)edu, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Understanding histograms
Date:	2008-04-30 23:17:44
Message-ID:	26230.1209597464@sss.pgh.pa.us
Lists:	pgsql-performance

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Wed, 2008-04-30 at 10:43 -0400, Tom Lane wrote:
>> Surely that's not very sane? The MCV list plus histogram generally
>> don't include every value in the table.

> My understanding of Len's question is that, although the MCV list plus
> the histogram don't include every distinct value in the general case,
> they do include every value in the specific case where the histogram is
> not full.

I don't believe that's true. It's possible that a small histogram means
that you are seeing every value that was in ANALYZE's sample, but it's
a mighty long leap from that to the assumption that there are no other
values in the table. In any case that seems more an artifact of the
implementation than a property the histogram would be guaranteed to
have.
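
[Editor's illustration, not part of the original message.] The gap between "every value in ANALYZE's sample" and "every value in the table" can be sketched with a toy simulation. Everything here is an assumption made for illustration: the table contents, the sample size of 300, and the helper names `build_table` and `distinct_in_sample` are hypothetical, and plain random sampling stands in for whatever ANALYZE actually does.

```python
import random


def build_table():
    """Hypothetical table: 10 common values plus 500 values that occur once."""
    common = [v for v in range(10) for _ in range(10_000)]  # 100,000 rows
    rare = list(range(1000, 1500))                          # 500 rows, all distinct
    return common + rare


def distinct_in_sample(table, sample_size, seed=0):
    """Return the set of distinct values seen in a simple random sample."""
    rng = random.Random(seed)
    return set(rng.sample(table, sample_size))


table = build_table()
seen = distinct_in_sample(table, sample_size=300)
missed = set(table) - seen

# A 300-row sample can contain at most 300 distinct values, so at least
# 210 of the table's 510 distinct values are guaranteed to be missing.
print(len(set(table)), len(seen), len(missed))
```

Statistics built from `seen` alone would look complete for the sample while saying nothing about the values in `missed`.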

> ... the statistics aren't guaranteed to be perfectly up-to-date, so an
> estimate of zero might be risky.

Right. As a matter of policy we never estimate less than one matching
row; and I've seriously considered pushing that up to at least two rows
except when we see that the query condition matches a unique constraint.
You can get really bad join plans from overly-small estimates.
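
[Editor's illustration, not part of the original message.] The policy described above amounts to clamping the estimated row count from below. The following is a minimal sketch, not PostgreSQL's planner code: the function name `clamp_row_estimate` and the `min_rows` parameter are hypothetical, and `min_rows=2` only models the stricter floor that the message says was considered, not adopted.

```python
def clamp_row_estimate(selectivity, table_rows, min_rows=1):
    """Turn a selectivity into a row estimate that never drops below min_rows.

    selectivity: estimated fraction of rows matching the condition (0.0 to 1.0)
    table_rows:  estimated number of rows in the table
    min_rows:    hypothetical floor; 1 models the stated policy
    """
    raw = selectivity * table_rows
    return max(float(min_rows), float(round(raw)))


# A value absent from the statistics gets selectivity 0, but the
# estimate is still one row rather than zero.
print(clamp_row_estimate(0.0, 1_000_000))

# The stricter floor mentioned in the message, for conditions that are
# not known to match a unique constraint.
print(clamp_row_estimate(0.0, 1_000_000, min_rows=2))
```

The floor matters because a zero or near-zero estimate on one side of a join makes a plan that repeats work per outer row look nearly free, which is how overly small estimates lead to bad join plans.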

regards, tom lane

In response to

Re: Understanding histograms at 2008-04-30 22:47:02 from Jeff Davis

Responses

Re: Understanding histograms at 2008-05-01 00:53:44 from Gregory Stark
