From: | "Zeugswetter Andreas SB SD" <ZeugswetterA(at)spardat(dot)at> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Josh Berkus" <josh(at)agliodbs(dot)com> |
Cc: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hypothetical suggestions for planner, indexing improvement |
Date: | 2003-05-06 15:05:44 |
Message-ID: | 46C15C39FEB2C44BA555E356FBCD6FA4961F99@m0114.s-mxs.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> > The reason that I mention EXISTS is because that's where the lack of
> > cross-column corellation is most dramatic; the planner seems to estimate a
> > flat 50% for EXISTS clauses regardless of the content.
>
> No "seems to" about that one: see
> src/backend/optimizer/path/clausesel.c
>
> else if (is_subplan(clause))
> {
> /*
> * Just for the moment! FIX ME! - vadim 02/04/98
> */
> s1 = (Selectivity) 0.5;
> }
I think the main issue cannot be the correlation in this case.
In this FK PK case an events row exists for each row in the subselect,
so the correlation must be above 1, how much above is irrelevant for the
selectivity estimate (it is only relevant in the below 1 cases).
The selectivity in this case can be estimated by estimating the number of rows
returned from the subselect where the (in the example missing) PK-FK join condition
is removed (here: select distinct event_id from event_day where event_day BETWEEN
'2003-04-08' AND '2003-05-18'). (selectivity = min (1, e. rows of subselect / e. rows of
main select))
The "event_day BETWEEN '2003-04-08' AND '2003-05-18'" is what really reduces the
result set here, and that is not used.
Andreas
PS: in the example the subselect join clause event_id=events.event_id is missing
From | Date | Subject | |
---|---|---|---|
Next Message | Andreas Pflug | 2003-05-06 15:17:08 | contribute pg_get_viewdef2 et al |
Previous Message | Dave Page | 2003-05-06 14:14:57 | Re: 7.4 features list |