Re: Slow recursive CTE query questions, with row estimate and n_distinct issues

From: Michael Lewis <mlewis(at)entrata(dot)com>
To: Christopher Baines <mail(at)cbaines(dot)net>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Slow recursive CTE query questions, with row estimate and n_distinct issues
Date: 2020-12-28 15:12:40
Message-ID: CAHOFxGo2qtDSbBh6xqQ-DZpCRSR=d0sXdU9xAOyTDxH-iPVo4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Dec 28, 2020 at 7:51 AM Christopher Baines <mail(at)cbaines(dot)net> wrote:

> derivation_inputs:
> COUNT(*): 285422539
> reltuples: 285422528
>
> derivation_id:
> COUNT(DISTINCT): 7508610
> n_distinct: 4336644 (~57% of the true value)
>
> derivation_output_id:
> COUNT(DISTINCT): 5539406
> n_distinct: 473762 (~8% of the true value)
>

If you expect the ratio of distinct of derivation_output_id values to be
roughly linear going forward, you can set a custom value for n_distinct on
the column (currently seems like -.0194, aka distinct count
of derivation_output_id divided by reltuples of the table). You could also
do this analysis every month or six and set the custom value as needed.

https://www.postgresql.org/docs/current/sql-altertable.html

I am not sure if it will resolve your query problems though.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Spiegelberg 2020-12-28 16:09:09 Re: Slow recursive CTE query questions, with row estimate and n_distinct issues
Previous Message Christopher Baines 2020-12-28 14:50:55 Slow recursive CTE query questions, with row estimate and n_distinct issues