RE: [HACKERS] Proposed cleanup of index-related planner estimation procedures

From: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: RE: [HACKERS] Proposed cleanup of index-related planner estimation procedures
Date: 2000-01-07 00:09:58
Message-ID: 000501bf58a3$88a213a0$2801007e@tpf.co.jp
Lists: pgsql-hackers

> -----Original Message-----
> From: owner-pgsql-hackers(at)postgreSQL(dot)org
> [mailto:owner-pgsql-hackers(at)postgreSQL(dot)org]On Behalf Of Tom Lane
>
> I am thinking about redefining and simplifying the planner's interface
> to index-type-dependent estimation routines.
>
> Currently, each defined index type is supposed to supply two routines:
> an "amopselect" routine and an "amopnpages" routine. (The existing
> actual routines of this kind are btreesel, btreenpages, etc in
> src/backend/utils/adt/selfuncs.c.) These things are called by
> index_selectivity() in src/backend/optimizer/util/plancat.c. amopselect
> tries to determine the selectivity of an indexqual (the fraction of
> main-table tuples it will select) and amopnpages tries to determine
> the number of index pages that will be read to do it.
>
> Now, this collection of code is largely redundant with
> optimizer/path/clausesel.c, which also tries to estimate the selectivity
> of qualification conditions. Furthermore, the interface to these
> routines is fundamentally misdesigned, because there is no way to deal
> with interrelated qualification conditions --- for example, if we have
> a range query like "... WHERE x > 10 AND x < 20", the code estimates
> the selectivity as the product of the selectivities of the two terms
> independently, but the actual selectivity is very different from that.
> I am working on fixing clausesel.c to be smarter about correlated
> conditions, but it won't do much good to fix that code without fixing
> the index-related code.
>
> What I'm thinking about doing is replacing these two per-index-type
> routines with a single routine, which is called once per proposed
> indexscan rather than once per qual clause. It would receive the
> whole indexqual list as a parameter, instead of just one qual.
> A typical implementation would just call clausesel.c's general-purpose
> code to estimate the selectivity, and then do a little bit of extra
> work to derive the estimated number of index pages from that number.
>

Seems good to me.
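
To make the quoted range-query point concrete, here is a minimal sketch, not the planner's actual code. It assumes x is uniformly distributed on [0, 80) and that the index has 400 pages; all function names are hypothetical. It compares the per-clause product with a paired estimate for "x > 10 AND x < 20", and then derives an index page count from the selectivity, in the spirit of the single per-indexscan routine proposed above.

```python
import math

def sel_greater(value, lo, hi):
    """Fraction of rows with x > value, ASSUMING x is uniform on [lo, hi)."""
    return min(1.0, max(0.0, (hi - value) / (hi - lo)))

def sel_less(value, lo, hi):
    """Fraction of rows with x < value, under the same uniformity assumption."""
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))

def product_estimate(selectivities):
    """Per-clause behaviour: treat every qual as independent and multiply."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result

def range_pair_estimate(sel_gt, sel_lt):
    """Treat 'x > a AND x < b' as one range on the same column.

    Rows failing 'x > a' and rows failing 'x < b' are disjoint when a < b,
    so the fraction passing both is sel_gt + sel_lt - 1 (clamped to [0, 1]).
    """
    return min(1.0, max(0.0, sel_gt + sel_lt - 1.0))

def estimated_index_pages(selectivity, total_index_pages):
    """Hypothetical 'extra work': scale the index size by the selectivity."""
    return max(1, math.ceil(selectivity * total_index_pages))

# ... WHERE x > 10 AND x < 20, with x assumed uniform on [0, 80)
gt = sel_greater(10, 0, 80)
lt = sel_less(20, 0, 80)

naive = product_estimate([gt, lt])
paired = range_pair_estimate(gt, lt)

naive_pages = estimated_index_pages(naive, 400)
paired_pages = estimated_index_pages(paired, 400)

print(naive, paired, naive_pages, paired_pages)
```

Under these assumptions the true fraction of rows in (10, 20) is 10/80, which the paired estimate reproduces, while the product overestimates both the selectivity and the number of index pages to be read.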

I have also been suspicious about per-qual selectivity and have
another example.
For the following query
select * from .. where col1=val1 and col2=val2;

the selectivity is selectivity of (col1=val1) * selectivity of (col2=val2)
currently. But it's not right in many cases.
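
A small sketch of why the product can be wrong, using made-up data in which col2 is fully determined by col1. The table contents and function names are hypothetical, chosen only for illustration. The pair counts stand in for the kind of multi-column statistic that could be kept for a multi-column index.

```python
from collections import Counter

# Made-up table of (col1, col2) rows; col2 depends entirely on col1.
rows = (
    [("a1", "b1")] * 40
    + [("a2", "b1")] * 10
    + [("a3", "b2")] * 30
    + [("a4", "b2")] * 20
)

def column_selectivity(table, col, value):
    """Fraction of rows where the given column equals value."""
    return sum(1 for r in table if r[col] == value) / len(table)

def independent_estimate(table, val1, val2):
    """Current behaviour: multiply the two single-column selectivities."""
    return column_selectivity(table, 0, val1) * column_selectivity(table, 1, val2)

def pair_selectivity(table, val1, val2):
    """Estimate from a joint (col1, col2) frequency count."""
    pair_counts = Counter(table)
    return pair_counts[(val1, val2)] / len(table)

# select * from .. where col1 = 'a1' and col2 = 'b1';
estimated = independent_estimate(rows, "a1", "b1")
actual = pair_selectivity(rows, "a1", "b1")

print(estimated, actual)
```

Here every row with col1 = 'a1' also has col2 = 'b1', so the second condition filters nothing, yet the product halves the estimate. The joint count gives the true fraction for this data.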

Though it's almost impossible to hold disbursions for all combinations
of columns, it may be possible to hold multi-column disbursions for
multi-column indexes.

Regards.

Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp
