From: | Christopher Baines <mail(at)cbaines(dot)net> |
---|---|
To: | Michael Lewis <mlewis(at)entrata(dot)com> |
Cc: | Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Slow recursive CTE query questions, with row estimate and n_distinct issues |
Date: | 2020-12-29 22:40:39 |
Message-ID: | 875z4k9qxk.fsf@cbaines.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Michael Lewis <mlewis(at)entrata(dot)com> writes:
> On Mon, Dec 28, 2020 at 7:51 AM Christopher Baines <mail(at)cbaines(dot)net> wrote:
>
>> derivation_inputs:
>> COUNT(*): 285422539
>> reltuples: 285422528
>>
>> derivation_id:
>> COUNT(DISTINCT): 7508610
>> n_distinct: 4336644 (~57% of the true value)
>>
>> derivation_output_id:
>> COUNT(DISTINCT): 5539406
>> n_distinct: 473762 (~8% of the true value)
>>
>
> If you expect the ratio of distinct of derivation_output_id values to be
> roughly linear going forward, you can set a custom value for n_distinct on
> the column (currently seems like -.0194, aka distinct count
> of derivation_output_id divided by reltuples of the table). You could also
> do this analysis every month or six and set the custom value as needed.
>
> https://www.postgresql.org/docs/current/sql-altertable.html
>
> I am not sure if it will resolve your query problems though.
Thanks Michael, I didn't realise a custom value could be set, but I'll
look in to this.
I actually managed to speed the query up enough by increasing
work_mem/shared_buffers. I didn't realise one of the sequential scans
was executing 14 times, but giving PostgreSQL more resources means it
just executes once, which really helps.
Thanks again,
Chris
From | Date | Subject | |
---|---|---|---|
Next Message | Christopher Baines | 2020-12-29 22:45:55 | Re: Slow recursive CTE query questions, with row estimate and n_distinct issues |
Previous Message | Greg Spiegelberg | 2020-12-28 16:09:09 | Re: Slow recursive CTE query questions, with row estimate and n_distinct issues |