From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Pierre Ducroquet <p(dot)psql(at)pinaraf(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: add support for IN and @> in functional-dependency statistics use |
Date: | 2020-03-25 00:28:01 |
Message-ID: | 20200325002801.i45f34l45rqmik4v@development |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Mar 19, 2020 at 07:53:39PM +0000, Dean Rasheed wrote:
>On Wed, 18 Mar 2020 at 00:29, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> OK, I took a look. I think from the correctness POV the patch is OK, but
>> I think the dependencies_clauselist_selectivity() function now got a bit
>> too complex. I've been able to parse it now, but I'm sure I'll have
>> trouble in the future :-(
>>
>> Can we refactor / split it somehow and move bits of the logic to smaller
>> functions, or something like that?
>>
>
>Yeah, it has gotten a bit long. It's somewhat tricky splitting it up,
>because of the number of shared variables used throughout the
>function, but here's an updated patch splitting it into what seemed
>like the 2 most logical pieces. The first piece (still in
>dependencies_clauselist_selectivity()) works out what dependencies
>can/should be applied, and the second piece in a new function does the
>actual work of applying the list of functional dependencies to the
>clause list.
>
>I think that has made it easier to follow, and it has also reduced the
>complexity of the final "no applicable stats" branch.
>
Seems OK to me.
I'd perhaps name deps_clauselist_selectivity differently, it's a bit too
similar to dependencies_clauselist_selectivity. Perhaps something like
clauselist_apply_dependencies? But that's a minor detail.
>> Another thing I'd like to suggest is keeping the "old" formula, and
>> instead of just replacing it with
>>
>> P(a,b) = f * Min(P(a), P(b)) + (1-f) * P(a) * P(b)
>>
>> but explaining how the old formula may produce nonsensical selectivity,
>> and how the new formula addresses that issue.
>>
>
>I think this is purely a comment issue? I've added some more extensive
>comments attempting to justify the formulae.
>
Yes, it was purely a comment issue. Seems fine now.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Andy Fan | 2020-03-25 00:39:29 | Re: Index Skip Scan |
Previous Message | James Coleman | 2020-03-25 00:23:44 | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |