From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Claudio Freire <klaussfreire(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Making clausesel.c Smarter |
Date: | 2017-04-06 23:53:31 |
Message-ID: | CAKJS1f9L-SRUMVE9Mp3zUf603_8y4iwGRVBi2Gg7baNvu8PGXA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 7 April 2017 at 09:05, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> On Tue, Apr 4, 2017 at 1:00 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>>>> If you ask me, I'd just leave:
>>>>
>>>> + if (rqlist->hibound == DEFAULT_INEQ_SEL || rqlist->lobound ==
>>>> DEFAULT_INEQ_SEL)
>>>> + {
>>>> + /* No point in checking null selectivity, DEFAULT_INEQ_SEL
>>>> implies we have no stats */
>>>> + s2 = DEFAULT_RANGE_INEQ_SEL;
>>>> + }
>>>> + else
>>>> + {
>>>> ...
>>>> + s2 = rqlist->hibound + rqlist->lobound - 1.0;
>>>> + nullsel = nulltestsel(root, IS_NULL, rqlist->var, varRelid);
>>>> + notnullsel = 1.0 - nullsel;
>>>> +
>>>> + /* Adjust for double-exclusion of NULLs */
>>>> + s2 += nullsel;
>>>> + if (s2 <= 0.0) {
>>>> + if (s2 <= (-1.0e-4 * notnullsel - 1.0e-6))
>>>> + {
>>>> + /* Most likely contradictory clauses found */
>>>> + s2 = (1.0e-10 < notnullsel) ? 1.0e-10 : notnullsel;
>>>> + }
>>>> + else
>>>> + {
>>>> + /* Could be a rounding error */
>>>> + s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
>>>> + }
>>>> + }
>>>> + }
>>>>
>>>> Where (-1.0e-4 * notnullsel - 1.0e-6) is just a very rough (and overly
>>>> cautious) estimation of the amount of rounding error that could be
>>>> there with 32-bit floats.
>>>>
>>>> The above assumes a non-DEFAULT_INEQ_SEL value in lobound/hibound is
>>>> an estimation based on stats, maybe approximate, but one that makes
>>>> sense (ie: preserves the monotonicity of the CPF), and as such
>>>> negative values are either sign of a contradiction or rounding error.
>>>> The previous code left the possibility of negatives coming out of
>>>> default selectivities creeping up on non-DEFAULT_INEQ_SEL estimates,
>>>> but that doesn't seem possible coming out of scalarineqsel.
>>>>
>>>> But I'd like it if Tom could comment on that, since he's the one that
>>>> did that in commit 547bb4a7f2bdccad9253a99211ce84b3f9de485a (way back
>>>> in 2004).
>>>>
>>>> Barring that, I'd just change the
>>>>
>>>> s2 = DEFAULT_RANGE_INEQ_SEL;
>>>>
>>>> To
>>>>
>>>> s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
>>>>
>>>> Which makes a lot more sense.
>>>
>>> I think to fix this properly the selfuncs would have to return the
>>> estimate, and nullfrac separately, so the nullfrac could just be
>>> applied once per expression. There's already hacks in there to get rid
>>> of the double null filtering. I'm not proposing that as something for
>>> this patch. It would be pretty invasive to change this.
>>
>> There's no need, you can compute the nullfrac with nulltestsel. While
>> there's some rework involved, it's not a big deal (it just reads the
>> stats tuple), and it's a clean solution.
>
> I'm marking this Waiting on Author until we figure out what to do with
> those issues (the null issue quoted above, and the quadratic behavior)
I've attached a rebased patch as the existing one no longer applies.
I've not yet had time to wrap my head around the null handling stuff.
I'll try to get to that today.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
smarter_clausesel_2017-04-07.patch | application/octet-stream | 16.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tatsuo Ishii | 2017-04-07 00:17:05 | Re: pgbench - allow to store select results into variables |
Previous Message | Tom Lane | 2017-04-06 23:47:50 | Re: Performance improvement for joins where outer side is unique |