From: | Greg Nolle <greg(dot)nolle(at)voidbridge(dot)com> |
---|---|
To: | Michael Lewis <mlewis(at)entrata(dot)com> |
Cc: | PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Inaccurate (sometimes wildly so) row estimates for simple join |
Date: | 2020-05-22 14:42:46 |
Message-ID: | CA+wHfn-iS3D5gLQG8fJrJvOZvT7v-dWKJuLPg+qPj=3uv_7RXg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 22 May 2020 at 15:30:42, Michael Lewis (mlewis(at)entrata(dot)com) wrote:
On Fri, May 22, 2020 at 7:27 AM Greg Nolle <greg(dot)nolle(at)voidbridge(dot)com>
wrote:
> The crux seems to be that test_b does not have an even distribution for
> a_id values: it only has records for two of the values in the referenced
> table. This is how our real dataset is too and isn’t something we can
> really change.
>
How does it perform with extended statistics created for these two columns?
https://www.postgresql.org/docs/current/sql-createstatistics.html
I forgot to mention that I had already tried that actually, and it sadly
made no difference. I tried both dependencies and ndistinct modes and the
estimates remained the same. Having read through the docs I think that’s to
be expected since dependencies is designed to help when you have multiple
predicates and ndistinct when you’re doing grouping. Neither of which is
the case with my query.
For good measure I just tried mcv mode on PG12 with the test script I
posted and that also made no difference.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-05-22 14:44:09 | Re: btree_gist extension - gbt_cash_union return type |
Previous Message | Michael Lewis | 2020-05-22 14:30:16 | Re: Inaccurate (sometimes wildly so) row estimates for simple join |