From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Statistics and selectivity estimation for ranges |
Date: | 2012-08-27 13:00:55 |
Message-ID: | 503B6F87.8080509@enterprisedb.com |
Lists: | pgsql-hackers |
On 24.08.2012 18:51, Heikki Linnakangas wrote:
> On 20.08.2012 00:31, Alexander Korotkov wrote:
>> New version of patch.
>> * Collect new stakind STATISTIC_KIND_BOUNDS_HISTOGRAM, which is lower and
>> upper bounds histograms combined into single ranges array, instead
>> of STATISTIC_KIND_HISTOGRAM.
>
> One worry I have about that format for the histogram is that you
> deserialize all the values in the histogram, before you do the binary
> searches. That seems expensive if stats target is very high. I guess you
> could deserialize them lazily to alleviate that, though.
>
>> * Selectivity estimations for >, >=, <, <=, <<, >>, &<, &> using this
>> histogram.
>
> Thanks!
>
> I'm going to do the same for this that I did for the sp-gist patch, and
> punt on the more complicated parts for now, and review them separately.
> Attached is a heavily edited version that doesn't include the length
> histogram, and consequently doesn't do anything smart for the &< and &>
> operators. && is estimated using the bounds histograms. There's now a
> separate stakind for the empty range fraction, since it's not included
> in the length-histogram.
>
> I tested this on a dataset containing birth and death dates of persons
> that have a Wikipedia page, obtained from the dbpedia.org project. I can
> send a copy if someone wants it. The estimates seem pretty accurate.
>
> Please take a look, to see if I messed up something.
Committed this with some further changes.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
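
Illustrative note on the lazy-deserialization idea raised in the quoted review: the sketch below is not the committed PostgreSQL code, only a minimal Python model of the concern. It assumes, purely for illustration, that each histogram bound is stored as one 8-byte big-endian double; the names `LazyHistogram` and `count_below` are hypothetical. The point it tries to show is that a binary search only needs to decode the handful of entries it actually probes, rather than the whole array up front.

```python
import struct


class LazyHistogram:
    """Histogram bounds kept in packed form; an entry is decoded only when read.

    Assumption for this sketch: every bound is an 8-byte big-endian double.
    """

    ENTRY_SIZE = 8

    def __init__(self, packed: bytes):
        self._packed = packed
        self._n = len(packed) // self.ENTRY_SIZE
        self._cache = {}      # index -> decoded value
        self.decoded = 0      # how many entries were actually deserialized

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if i not in self._cache:
            (value,) = struct.unpack_from(">d", self._packed, i * self.ENTRY_SIZE)
            self._cache[i] = value
            self.decoded += 1
        return self._cache[i]


def count_below(hist, const):
    """Binary search: number of (sorted) bounds strictly less than const."""
    lo, hi = 0, len(hist)
    while lo < hi:
        mid = (lo + hi) // 2
        if hist[mid] < const:
            lo = mid + 1
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    bounds = [float(i) for i in range(1024)]
    hist = LazyHistogram(struct.pack(">1024d", *bounds))
    below = count_below(hist, 300.5)
    print(below, "bounds below the constant;", hist.decoded, "entries decoded")
```

With a high statistics target the array is long, but the number of decoded entries grows only logarithmically with its length, which is the saving the lazy approach is after.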