From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Statistics and selectivity estimation for ranges |
Date: | 2012-08-21 13:25:02 |
Message-ID: | 50338C2E.5060908@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 20.08.2012 00:31, Alexander Korotkov wrote:
> On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas<
> heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>
>> On 15.08.2012 11:34, Alexander Korotkov wrote:
>>
>>> Ok, we've to decide if we need "standard" histogram. In some cases it can
>>> be used for more accurate estimation of< and> operators.
>>> But I think it is not so important. So, we can replace "standard"
>>> histogram
>>> with histograms of lower and upper bounds?
>>
>> Yeah, I think that makes more sense. The lower bound histogram is still
>> useful for< and> operators, just not as accurate if there are lots of
>> values with the same lower bound but different upper bound.
>
> New version of patch.
> * Collect new stakind STATISTIC_KIND_BOUNDS_HISTOGRAM, which is lower and
> upper bounds histograms combined into single ranges array, instead
> of STATISTIC_KIND_HISTOGRAM.
Ah, that's an interesting approach. So essentially, the histogram looks
just like a normal STATISTIC_KIND_HISTOGRAM histogram, but the values
stored in it are not picked the usual way. The usual way would be to
pick N evenly-spaced values from the column, and store those. Instead,
you pick N evenly-spaced lower bounds, and N evenly-spaced upper bounds,
and construct N range values from those. Looking at a single value in
the histogram, its lower bound comes from a different row than its upper
bound.
That's pretty clever - the histogram has a shape and order that's
compatible with a histogram you'd get with the standard scalar
typanalyze function. In fact, I think you could just let the standard
scalar estimators for < and > to use that histogram as is. Perhaps we
should use STATISTIC_KIND_HISTOGRAM for this after all...
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Sachin Srivastava | 2012-08-21 13:34:40 | PostgreSQL 9.2beta4 (& git HEAD) server crash on creating extension plpython3u |
Previous Message | Michael Paquier | 2012-08-21 12:13:28 | Re: multi-master pgbench? |