From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | ivanmulhin(at)gmail(dot)com, Pg Docs <pgsql-docs(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: incorrect information in documentation |
Date: | 2021-08-10 03:40:20 |
Message-ID: | CAKFQuwaV87e4sdFdZab4+zcUdza+EpveQi_98=f9wJ1+QLUALw@mail.gmail.com |
Lists: | pgsql-docs |
On Mon, Aug 9, 2021 at 11:05 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> > selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
> >             = (1 - 0) * (1 - 0) / max(10000, 10000)
> >             = 0.0001
>
> Nice, can you provide a patch please?
>
>
Change the line:
selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
to be:
selectivity = (1 - null_frac1) * (1 - null_frac2) / max(num_distinct1, num_distinct2)
The wording already talks about "divide by max".
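As a quick sanity check, the two forms can be compared numerically. This is only a sketch in Python, not PostgreSQL's implementation; the function names are made up for illustration, and the inputs are the values from the quoted example (no NULLs, 10000 distinct values on each side).

```python
import math

def selectivity_min_form(null_frac1, null_frac2, num_distinct1, num_distinct2):
    # Form currently in the docs: multiply by min(1/nd1, 1/nd2).
    return (1 - null_frac1) * (1 - null_frac2) * min(1 / num_distinct1, 1 / num_distinct2)

def selectivity_max_form(null_frac1, null_frac2, num_distinct1, num_distinct2):
    # Proposed form: divide by max(nd1, nd2), matching the "divide by max" wording.
    return (1 - null_frac1) * (1 - null_frac2) / max(num_distinct1, num_distinct2)

# Values from the quoted example: null fractions of 0, 10000 distinct values each.
old = selectivity_min_form(0, 0, 10000, 10000)
new = selectivity_max_form(0, 0, 10000, 10000)

# For positive distinct counts the two forms agree (up to floating-point rounding).
assert math.isclose(old, new)
print(new)
```

Since min(1/a, 1/b) equals 1/max(a, b) for positive a and b, the change is purely about making the formula read the same way as the text.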
Though the sentence:
"so we use an algorithm that relies only on the number of distinct values
for both relations together with their null fractions:"
could perhaps gain a parenthetical note:
"so we use an algorithm that relies only on the number of distinct values
(the row count estimate for the whole table, not the -1 in the column
statistics) for both relations together with their null fractions:"
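To illustrate what the note is getting at: in pg_stats, a negative n_distinct is stored as a fraction of the row count rather than an absolute count, so -1 means every row is distinct. A rough Python sketch of that conversion follows. The helper name is hypothetical, the 10000-row table is an assumption chosen to match the numbers in the example, and this is not the planner's code.

```python
def resolve_num_distinct(n_distinct, row_estimate):
    # Hypothetical helper: turn a pg_stats-style n_distinct into an absolute count.
    # Negative values are the negated ratio of distinct values to rows.
    if n_distinct < 0:
        return -n_distinct * row_estimate
    # Non-negative values are already an absolute count.
    return n_distinct

# Assumed inputs: n_distinct = -1 on a table estimated at 10000 rows.
num_distinct = resolve_num_distinct(-1, 10000)
assert num_distinct == 10000
```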
Note that I haven't tried to absorb that whole page, let alone the
implementation, and am not all that familiar with this part of PostgreSQL.
It seems right, though, in isolation.
David J.