From: | "Dian M Fay" <dian(dot)m(dot)fay(at)gmail(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: doc: Make selectivity example match wording |
Date: | 2022-07-17 23:07:15 |
Message-ID: | CLIB4H644MFM.1Y9ADT3YL7IZ3@medusa |
Lists: | pgsql-hackers |
On Sat Jul 16, 2022 at 11:23 PM EDT, David G. Johnston wrote:
> Thanks for the review. I generally like everything you said but it made me
> realize that I still didn't really understand the intent behind the
> formula. I spent way too much time working that out for myself, then
> turned what I found useful into this v2 patch.
>
> It may need some semantic markup still but figured I'd see if the idea
> makes sense.
>
> I basically rewrote, in a bit different style, the same material into the
> code comments, then proceeded to rework the proof that was already present
> there.
>
> I did do this in somewhat of a vacuum. I'm not inclined to learn this all
> start-to-end though. If the abrupt style change is unwanted so be it. I'm
> not really sure how much benefit the proof really provides. The comments
> in the docs are probably sufficient for the code as well - just define why
> the three pieces of the formula exist and are packaged into a single
> multiplier called selectivity as an API choice. I suspect once someone
> gets to that comment it is fair to assume some prior knowledge.
> Admittedly, I didn't really come into this that way...

Fair enough; I only know what I can glean from the comments in
eqjoinsel_inner and friends myself. I do think even this smaller change
is valuable: the current example talks about using an algorithm
based on the number of distinct values immediately after showing
n_distinct == -1, so making it clear that this case uses num_rows
instead is helpful.

"This value does get scaled in the non-unique case" could again be more
specific (perhaps "since here all values are unique; otherwise the
calculation uses num_distinct"?). But past that quibble I'm good.
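
To illustrate the n_distinct == -1 point, here is a minimal Python sketch of the calculation as the documented example describes it for columns without most-common-value lists. It is not the planner's C code from eqjoinsel_inner; the function names distinct_count and eqjoin_selectivity are hypothetical and exist only for this illustration. It assumes the pg_stats convention that a negative n_distinct is a fraction of the row count, so -1 resolves to num_rows.

```python
def distinct_count(n_distinct: float, num_rows: float) -> float:
    """Resolve a pg_stats-style n_distinct into an absolute distinct count.

    Assumption: negative values are minus the fraction of rows that are
    distinct (-1 means every row is unique); positive values are already
    an absolute count.
    """
    if n_distinct == 0:
        raise ValueError("n_distinct of 0 is not handled in this sketch")
    if n_distinct < 0:
        return -n_distinct * num_rows  # -1 scales to num_rows itself
    return n_distinct


def eqjoin_selectivity(null_frac1: float, num_distinct1: float,
                       null_frac2: float, num_distinct2: float) -> float:
    """Equality-join selectivity for the no-MCV case, per the docs formula:
    (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
    """
    return ((1 - null_frac1) * (1 - null_frac2)
            * min(1 / num_distinct1, 1 / num_distinct2))


# Hypothetical inputs: two 10000-row tables joined on unique, non-null columns.
nd1 = distinct_count(-1, 10000)
nd2 = distinct_count(-1, 10000)
selectivity = eqjoin_selectivity(0.0, nd1, 0.0, nd2)
```

For those hypothetical inputs both distinct counts resolve to the row count, so the selectivity is 1/10000. In the non-unique case (say n_distinct = -0.5, or a positive absolute count) the same formula applies with the scaled num_distinct in place of num_rows.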