From: | "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | RE: [HACKERS] Proposed cleanup of index-related planner estimation procedures |
Date: | 2000-01-07 00:09:58 |
Message-ID: | 000501bf58a3$88a213a0$2801007e@tpf.co.jp |
Lists: | pgsql-hackers |
> -----Original Message-----
> From: owner-pgsql-hackers(at)postgreSQL(dot)org
> [mailto:owner-pgsql-hackers(at)postgreSQL(dot)org]On Behalf Of Tom Lane
>
> I am thinking about redefining and simplifying the planner's interface
> to index-type-dependent estimation routines.
>
> Currently, each defined index type is supposed to supply two routines:
> an "amopselect" routine and an "amopnpages" routine. (The existing
> actual routines of this kind are btreesel, btreenpages, etc in
> src/backend/utils/adt/selfuncs.c.) These things are called by
> index_selectivity() in src/backend/optimizer/util/plancat.c. amopselect
> tries to determine the selectivity of an indexqual (the fraction of
> main-table tuples it will select) and amopnpages tries to determine
> the number of index pages that will be read to do it.
>
> Now, this collection of code is largely redundant with
> optimizer/path/clausesel.c, which also tries to estimate the selectivity
> of qualification conditions. Furthermore, the interface to these
> routines is fundamentally misdesigned, because there is no way to deal
> with interrelated qualification conditions --- for example, if we have
> a range query like "... WHERE x > 10 AND x < 20", the code estimates
> the selectivity as the product of the selectivities of the two terms
> independently, but the actual selectivity is very different from that.
> I am working on fixing clausesel.c to be smarter about correlated
> conditions, but it won't do much good to fix that code without fixing
> the index-related code.
>
> What I'm thinking about doing is replacing these two per-index-type
> routines with a single routine, which is called once per proposed
> indexscan rather than once per qual clause. It would receive the
> whole indexqual list as a parameter, instead of just one qual.
> A typical implementation would just call clausesel.c's general-purpose
> code to estimate the selectivity, and then do a little bit of extra
> work to derive the estimated number of index pages from that number.
>
Seems good to me.
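
To make the quoted range-query case concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that x is uniformly distributed over [0, 100); the function names are hypothetical and are not taken from clausesel.c. It compares the product of the two per-clause selectivities with an estimate that treats the two clauses as one range.

```python
# Assumption for illustration only: x is uniform over [LOW, HIGH).
LOW, HIGH = 0.0, 100.0

def clamp(fraction):
    """Keep a selectivity inside [0, 1]."""
    return min(max(fraction, 0.0), 1.0)

def sel_greater(bound):
    """Fraction of rows with x > bound."""
    return clamp((HIGH - bound) / (HIGH - LOW))

def sel_less(bound):
    """Fraction of rows with x < bound."""
    return clamp((bound - LOW) / (HIGH - LOW))

def independent_estimate(lo, hi):
    """Per-clause estimate: multiply the two selectivities."""
    return sel_greater(lo) * sel_less(hi)

def range_estimate(lo, hi):
    """Treat both clauses as one range on the same column.

    Rows failing 'x > lo' and rows failing 'x < hi' are disjoint sets
    when lo < hi, so the surviving fraction is s_lo + s_hi - 1.
    """
    return clamp(sel_greater(lo) + sel_less(hi) - 1.0)

if __name__ == "__main__":
    print("product of clauses:", independent_estimate(10, 20))
    print("range estimate:    ", range_estimate(10, 20))
```

Under that uniform assumption the product comes out near 0.18 while the range estimate is near 0.10, which is the true fraction of rows between 10 and 20.
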
I have also been suspicious of per-qual selectivity, and I have
another example.
For the following query
select * from .. where col1=val1 and col2=val2;
the selectivity is currently computed as
selectivity of (col1=val1) * selectivity of (col2=val2).
But that is not right in many cases, because it assumes the two
columns are independent.
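
A small sketch of why the product can be wrong, using made-up data in which col2 is fully determined by col1. The table contents and the helper name selectivity are assumptions for illustration only.

```python
# Hypothetical table: 1000 rows of (col1, col2), where col2 = 2 * col1,
# so the two columns are perfectly correlated.
rows = [(i % 10, (i % 10) * 2) for i in range(1000)]

def selectivity(table, predicate):
    """Fraction of rows in the table that satisfy the predicate."""
    return sum(1 for row in table if predicate(row)) / len(table)

s_col1 = selectivity(rows, lambda r: r[0] == 3)
s_col2 = selectivity(rows, lambda r: r[1] == 6)

# Per-qual estimate: multiply the two selectivities.
product_estimate = s_col1 * s_col2

# True selectivity of the combined condition.
actual = selectivity(rows, lambda r: r[0] == 3 and r[1] == 6)

if __name__ == "__main__":
    print("product estimate:", product_estimate)
    print("actual:          ", actual)
```

With this data each clause alone selects about 10% of the rows, so the product estimate is about 1%, but the combined condition still selects about 10%. The estimate is off by a factor of ten.
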
Though it's almost impossible to keep disbursions for every combination
of columns, it may be possible to keep multi-column disbursions for
multi-column indexes.
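
One way to picture a multi-column statistic is sketched below. It assumes a simplified definition, chosen here only for illustration and not necessarily the one the backend uses: the sum of squared value frequencies, i.e. the chance that two randomly chosen rows carry the same value. The same function is applied to each column and to the column pair that a two-column index would cover.

```python
from collections import Counter

def dispersion(values):
    """Sum of squared value frequencies (assumed, simplified definition)."""
    values = list(values)
    total = len(values)
    return sum((count / total) ** 2 for count in Counter(values).values())

# Hypothetical table of (col1, col2) with perfectly correlated columns.
rows = [(i % 10, (i % 10) * 2) for i in range(1000)]

col1_stat = dispersion(r[0] for r in rows)
col2_stat = dispersion(r[1] for r in rows)

# Statistic kept for the (col1, col2) pair, as for a two-column index.
joint_stat = dispersion(rows)

if __name__ == "__main__":
    print("col1 * col2:", col1_stat * col2_stat)
    print("joint:      ", joint_stat)
```

Only the pairs of columns that actually appear in an index need a joint statistic, which keeps the storage bounded by the number of indexes rather than by the number of column combinations.
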
Regards.
Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp