Re: single table - fighting a seq scan

From: Radoslav Nedyalkov <rnedyalkov(at)gmail(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: single table - fighting a seq scan
Date: 2020-07-14 19:31:39
Message-ID: CANhtRiYzw79GB49fvK3GEn5X9i=sbeVH2SOOKe0nM=Hd7yt2Mg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Michael,
full output from the query is attached.
here is the truncated lists version.

frac_mcv | 0.00306267
tablename | te
attname | current_pid
inherited | f
null_frac | 0.261823
n_distinct | 1.59236e+07
n_mcv | 560
n_hist | 5001
correlation | 0.995502
schemaname | public
tablename | te
attname | current_pid
inherited | f
null_frac | 0.261823
avg_width | 8
n_distinct | 1.59236e+07
most_common_vals |
{15026003,24951186,220707698,223344198,224236736,224355865,224359830,224371584,224380154,224382722,224388639,224390209,224394943,224396835,228259607,232148477,232173137,232379194,232729185,232913953,236304699,236797618,237355501,238860629,239082658}
most_common_freqs |
{4.46667e-05,1.93333e-05,4.66667e-06,4.66667e-06,...,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06}
histogram_bounds |
{13426761,13467316,13510844,..239215632,239302648,239371125,239466529,239532095,239571468,239611801,239634487}
correlation | 0.995502
most_common_elems | (null)
most_common_elem_freqs | (null)
elem_count_histogram | (null)

Thank You very much for the response.
I'll try with settings as you propose.

Rado

On Tue, Jul 14, 2020 at 9:46 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:

> rows=3832
> rows=3870
>
> Your estimate changed very little when you included 100 values vs 200
> values. That is interesting to me.
>
> What does the below query give you? How many of those 200 values are found
> in the MCVs list? If n_distinct is low, and most of the values are NOT in
> the most common value list, and the fraction of the table covered by the
> MCVs is also low, then the planner will expect that each of the 200 values
> represents some deceivingly high portion of the table.
>
> You said there are 80 million rows, yes? That seems likely that ndistinct
> and the MCVs list are not giving info very correlated with reality. You may
> want to increase the minimum table size for sequential to kick in. I cannot
> recall the name of that setting at the moment. You may also want to
> increase stats target on that column at least, analyze, and explain the
> query again.
>
> SELECT
> ( SELECT SUM (x) FROM UNNEST (most_common_freqs) x ) frac_MCV,
> tablename,
> attname,
> inherited,
> null_frac,
> n_distinct,
> array_length(most_common_vals,1) n_mcv,
> array_length(histogram_bounds,1) n_hist,
> correlation,
> *
> FROM pg_stats
> WHERE
> schemaname = 'public'
> AND tablename='te'
> AND attname='current_pid';
>
>>

Attachment Content-Type Size
out.txt text/plain 105.1 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Marc Millas 2020-07-14 19:31:58 Re: how to "explain" some ddl
Previous Message Michael Lewis 2020-07-14 18:45:38 Re: single table - fighting a seq scan