Quick Links

Re: single table - fighting a seq scan

From:	Michael Lewis <mlewis(at)entrata(dot)com>
To:	Radoslav Nedyalkov <rnedyalkov(at)gmail(dot)com>
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: single table - fighting a seq scan
Date:	2020-07-14 18:45:38
Message-ID:	CAHOFxGqfRa7uPkG2dEv5U2rt_ZLORVvBYGXLK7t7ZG8HGQgeeQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

rows=3832
rows=3870

Your estimate changed very little when you included 100 values vs 200
values. That is interesting to me.

What does the below query give you? How many of those 200 values are found
in the MCVs list? If n_distinct is low, and most of the values are NOT in
the most common value list, and the fraction of the table covered by the
MCVs is also low, then the planner will expect that each of the 200 values
represents some deceivingly high portion of the table.

You said there are 80 million rows, yes? That seems likely that ndistinct
and the MCVs list are not giving info very correlated with reality. You may
want to increase the minimum table size for sequential to kick in. I cannot
recall the name of that setting at the moment. You may also want to
increase stats target on that column at least, analyze, and explain the
query again.

SELECT
( SELECT SUM (x) FROM UNNEST (most_common_freqs) x ) frac_MCV,
tablename,
attname,
inherited,
null_frac,
n_distinct,
array_length(most_common_vals,1) n_mcv,
array_length(histogram_bounds,1) n_hist,
correlation,
*
FROM pg_stats
WHERE
schemaname = 'public'
AND tablename='te'
AND attname='current_pid';

In response to

single table - fighting a seq scan at 2020-07-14 17:29:03 from Radoslav Nedyalkov

Responses

Re: single table - fighting a seq scan at 2020-07-14 19:31:39 from Radoslav Nedyalkov

Browse pgsql-general by date

	From	Date	Subject
Next Message	Radoslav Nedyalkov	2020-07-14 19:31:39	Re: single table - fighting a seq scan
Previous Message	Marc Millas	2020-07-14 18:43:51	Re: how to "explain" some ddl