From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-performance(at)postgresql(dot)org, Andreas Kretschmer <akretschmer(at)spamfence(dot)net> |
Subject: | Re: BETWEEN optimizer problems with single-value |
Date: | 2006-03-16 02:05:24 |
Message-ID: | 11428.1142474724@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>> ISTM that when the BETWEEN constants match we end up in this part of
>> clauselist_selectivity()...
Yeah, I think you are right.
> so that the planner underestimates the cost of using "Cal_CalDate" so
> that it ends up the same as "Cal_CtofcNo", and then we pick
> "Cal_CalDate" because it was created first.
No, it doesn't end up the same --- but the difference is small enough to
be in the roundoff-error regime. The real issue here is that we're
effectively assuming that one row will be fetched from the index in both
cases, and this is clearly not the case for the Cal_CalDate index. So
we need a more accurate estimate for the boundary case.
> Using 1.0e-10 isn't very useful... the selectivity for a range should
> never be less than the selectivity for an equality, so we should simply
> put in a test against one of the pseudo constants and use that as the
> minimal value.
That's easier said than done, because you'd first have to find the
appropriate equality operator to use (ie, one having semantics that
agree with the inequality operators). Another point is that the above
statement is simply wrong, consider
calDate BETWEEN '2006-03-15' AND '2006-03-14'
for which an estimate of zero really is correct.
Possibly we could drop this code's reliance on seeing
SCALARLTSEL/SCALARGTSEL as the estimators, and instead try to locate a
common btree opclass for the operators --- which would then let us
identify the right equality operator to use, and also let us distinguish
> from >= etc. If we're trying to get the boundary cases right I
suspect we have to account for that. I could see such an approach being
tremendously slow though :-(, because we'd go looking for btree
opclasses even for operators that have nothing to do with < or >.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2006-03-16 02:55:36 | Re: Static build of psql with readline support |
Previous Message | Christopher Kings-Lynne | 2006-03-16 01:27:19 | Static build of psql with readline support |
From | Date | Subject | |
---|---|---|---|
Next Message | Guillaume Smet | 2006-03-16 10:45:12 | PostgreSQL and Xeon MP |
Previous Message | Jan de Visser | 2006-03-16 00:42:00 | Re: Slow SELECTS after large update cycle |