Re: single table - fighting a seq scan

From: Radoslav Nedyalkov <rnedyalkov(at)gmail(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: single table - fighting a seq scan
Date: 2020-07-14 20:04:11
Message-ID: CANhtRiaQ4HZR7D3_E3RayKH_zROJD_OSRe89eS5Y6cTrDZk6RA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ah, I could have messed up the examples I gave. Row numbers are different.
Once again the plans , sorry about that.

-- 200 entries

Gather (cost=1000.00..106905910.97 rows=7893 width=8)
Workers Planned: 2
-> Parallel Seq Scan on te (cost=0.00..106904121.67 rows=3289 width=8)
Filter: (current_pid = ANY
('{240900026,240900027,240900028,240900029,240900030,240900031,240900032,240900033,240900034,240900035,240900036,240900037,240900038,240900039,240900040,240900041,240900042,240900043,240900044,240900045,240900046,240900047,240900048,240900049,240900050,240900051,240900052,240900053,240900054,240900055,240900056,240900057,240900058,240900059,240900060,240900061,240900062,240900063,240900064,240900065,240900066,240900067,240900068,240900069,240900070,240900071,240900072,240900073,240900074,240900075,240900076,240900077,240900078,240900079,240900080,240900081,240900082,240900083,240900084,240900085,240900086,240900165,240900087,240900166,240900088,240900167,240900089,240900168,240900090,240900169,240900091,240900170,240900092,240900171,240900093,240900172,240900094,240900173,240900905,240900174,240900175,240900176,240900177,240900178,240900179,240900180,240900181,240900182,240900183,240900184,240900185,240900186,240900187,240900188,240900189,240900190,240900191,240900192,240900193,240900194,240900195,240900196,240900197,240900198,240900199,240900906,240900907,240900908,240900909,240900910,240900911,240900912,240900913,240900914,240900915,240900916,240900917,240900918,240900919,240900920,240900921,240900922,240900923,240900924,240900925,240900926,240900927,240900928,240901048,240901053,240901054,240901055,240901056,240901057,240901058,240901059,240901060,240901061,240901062,240901063,240901064,240901065,240901066,240901067,240901068,240901069,240901070,240901071,240901072,240901073,240901074,240901075,240901076,240901077,240901078,240901079,240901080,240901081,240901082,240901083,240901084,240901085,240901086,240901087,240901088,240901089,240901090,240901091,240901092,240901093,240901094,240901095,240901096,240901097,240901098,240901099,240901100,240901101,240901102,240901103,240901104,240901105,240901106,240901107,240901108,240901109,240901110,240901111,240901112,240901113,240901114,240901115,240901116,240901117,240901118,240901119,240901120,240901121,240901122,240901123,240901124,240901125,240901126,240901127,240901128,240901129}'::bigint[]))
(4 rows)

Time: 5.261 ms
========================================================================

-- 100 entries
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Index Scan using te_current_pid_idx on te (cost=0.57..731.26 rows=3832
width=8) (actual time=1.244..15.897 rows=600 loops=1)
Index Cond: (current_pid = ANY
('{240900026,240900027,240900028,240900029,240900030,240900031,240900032,240900033,240900034,240900035,240900036,240900037,240900038,240900039,240900040,240900041,240900042,240900043,240900044,240900045,240900046,240900047,240900048,240900049,240900050,240900051,240900052,240900053,240900054,240900055,240900056,240900057,240900058,240900059,240900060,240900061,240900062,240900063,240900064,240900065,240900066,240900067,240900068,240900069,240900070,240900071,240900072,240900073,240900074,240900075,240900076,240900077,240900078,240900079,240900080,240900081,240900082,240900083,240900084,240900085,240900086,240900165,240900087,240900166,240900088,240900167,240900089,240900168,240900090,240900169,240900091,240900170,240900092,240900171,240900093,240900172,240900094,240900173,240900905,240900174,240900175,240900176,240900177,240900178,240900179,240900180,240900181,240900182,240900183,240900184,240900185,240900186,240900187,240900188,240900189,240900190,240900191,240900192,240900193,240900194}'::bigint[]))
Planning Time: 3.430 ms
Execution Time: 15.954 ms
(4 rows)

Time: 20.257 ms

On Tue, Jul 14, 2020 at 10:31 PM Radoslav Nedyalkov <rnedyalkov(at)gmail(dot)com>
wrote:

> Hi Michael,
> full output from the query is attached.
> here is the truncated lists version.
>
> frac_mcv | 0.00306267
> tablename | te
> attname | current_pid
> inherited | f
> null_frac | 0.261823
> n_distinct | 1.59236e+07
> n_mcv | 560
> n_hist | 5001
> correlation | 0.995502
> schemaname | public
> tablename | te
> attname | current_pid
> inherited | f
> null_frac | 0.261823
> avg_width | 8
> n_distinct | 1.59236e+07
> most_common_vals |
> {15026003,24951186,220707698,223344198,224236736,224355865,224359830,224371584,224380154,224382722,224388639,224390209,224394943,224396835,228259607,232148477,232173137,232379194,232729185,232913953,236304699,236797618,237355501,238860629,239082658}
> most_common_freqs |
> {4.46667e-05,1.93333e-05,4.66667e-06,4.66667e-06,...,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06,4e-06}
> histogram_bounds |
> {13426761,13467316,13510844,..239215632,239302648,239371125,239466529,239532095,239571468,239611801,239634487}
> correlation | 0.995502
> most_common_elems | (null)
> most_common_elem_freqs | (null)
> elem_count_histogram | (null)
>
> Thank You very much for the response.
> I'll try with settings as you propose.
>
> Rado
>
> On Tue, Jul 14, 2020 at 9:46 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:
>
>> rows=3832
>> rows=3870
>>
>> Your estimate changed very little when you included 100 values vs 200
>> values. That is interesting to me.
>>
>> What does the below query give you? How many of those 200 values are
>> found in the MCVs list? If n_distinct is low, and most of the values are
>> NOT in the most common value list, and the fraction of the table covered by
>> the MCVs is also low, then the planner will expect that each of the 200
>> values represents some deceivingly high portion of the table.
>>
>> You said there are 80 million rows, yes? That seems likely that ndistinct
>> and the MCVs list are not giving info very correlated with reality. You may
>> want to increase the minimum table size for sequential to kick in. I cannot
>> recall the name of that setting at the moment. You may also want to
>> increase stats target on that column at least, analyze, and explain the
>> query again.
>>
>> SELECT
>> ( SELECT SUM (x) FROM UNNEST (most_common_freqs) x ) frac_MCV,
>> tablename,
>> attname,
>> inherited,
>> null_frac,
>> n_distinct,
>> array_length(most_common_vals,1) n_mcv,
>> array_length(histogram_bounds,1) n_hist,
>> correlation,
>> *
>> FROM pg_stats
>> WHERE
>> schemaname = 'public'
>> AND tablename='te'
>> AND attname='current_pid';
>>
>>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Eudald Valcàrcel Lacasa 2020-07-14 20:25:07 Issue executing query from container
Previous Message Adrian Klaver 2020-07-14 20:04:01 Re: Problem with FDW wrapper errors