From: | "Serbin, Ilya" <iserbin(at)bostonsd(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #16280: dead tuples (probably) effect plan and query performance |
Date: | 2020-02-28 09:25:22 |
Message-ID: | CALTXVijBNY+yqvDfRCmL_2ZAVmJN_hEvJvWLJ2mxJiFoTD=oZA@mail.gmail.com |
Lists: pgsql-bugs
Hello, Tom!
Thanks for your answer. My concern is that the plan changes after a relatively
small number of dead tuples. The bad plan starts being generated once table1
contains 300-400 dead tuples, which is only 0.07%-0.08% of the whole table
(409k+ entries).
In addition, the table is growing and currently holds 425k entries. Even at
that size the plan stays good until the number of dead tuples reaches 300-400;
after that it switches to the bad one.
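For reference, this is roughly how I'm watching the dead tuple count (the
standard pg_stat_user_tables view; table1 is the table from the original
report):

    -- dead/live tuple counts and last (auto)vacuum time for the table
    SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'table1';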
As I said, I tried analyzing the table with various default_statistics_target
values (100-1000 in steps of 100) - the plan stays bad. I also tried setting
random_page_cost=0.1 with seq_page_cost=1 (then 2, 3, 4, etc.); the plan only
switched to the good one starting from random_page_cost=0.1 and
seq_page_cost=8. However, once I ran VACUUM, the plan changed to the good one
and stayed there even with random_page_cost=30 and seq_page_cost=1.
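In case it's useful, those experiments were along these lines (sketch only;
the EXPLAIN target is the query from the original report, not repeated here):

    -- statistics target experiments (session level), then re-analyze
    SET default_statistics_target = 1000;  -- also tried 100, 200, ..., 900
    ANALYZE table1;

    -- cost parameter experiments (session level)
    SET random_page_cost = 0.1;
    SET seq_page_cost = 8;
    -- then re-ran EXPLAIN (ANALYZE, BUFFERS) on the query from the original report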
I realize that I could set per-table autovacuum thresholds so that vacuum
triggers once dead tuples reach 300, but it doesn't seem right to me: this
number of changes accumulates in about 5 minutes, and the table is ~2GB in
size as of now.
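For completeness, the per-table override I have in mind would be something
like this (the numbers are only illustrative):

    ALTER TABLE table1 SET (
        autovacuum_vacuum_threshold = 300,   -- absolute dead-tuple trigger
        autovacuum_vacuum_scale_factor = 0   -- don't scale the trigger with table size
    );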
Why does such a small amount (0.07%) of dead tuples change the cost
estimates so dramatically? Or am I missing something and dead tuples have
nothing to do with it?
Thanks in advance,
Ilya
Thu, 27 Feb 2020 at 18:41, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> > Issue description:
> > After some minimal activity in database first plan changes to the second.
> > Analyze on table1 do not help (tried with various
> > default_statistics_target values).
> > content_idx index recreation helps for some time, but several minutes
> > later plan degrades back to second one.
> > The only thing helped (surprisingly) is vacuum. It also helps for some
> > time, but once number of dead tuples reaches something like 300-500 -
> > plan changes back to second one.
>
> Surely it is *not* a bug that dead tuples affect the plan choice.
> The density of live tuples is an important factor in the relative
> costs of different table scan techniques.
>
> In the case at hand, I wonder why your rowcount estimate is off
> by a factor of 50:
>
>   ->  Bitmap Index Scan on content_idx  (cost=0.00..155.07 rows=409 width=0) (actual time=4.932..4.932 rows=21952 loops=1)
>         Index Cond: (content @> '{"anotherjsonkey": {"values": ["13"]}}'::jsonb)
>         Buffers: shared hit=48
>
> If you can't improve that you're not likely to get a good plan, and
> futzing around with cost factors to make this particular query do
> "the right thing" anyway is inevitably going to make things worse
> for other queries. Maybe a larger stats target for the content column
> would help, but I fear that this @> condition is just beyond
> the ability of the planner to estimate. You might need to redesign
> the data representation to make it a bit more SQL-friendly.
>
> regards, tom lane
>
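For the per-column statistics target you mention, I assume that would be
along these lines ("content" being the jsonb column from the plan above;
sketch only):

    ALTER TABLE table1 ALTER COLUMN content SET STATISTICS 1000;
    ANALYZE table1;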
--
Ilya Serbin | Senior Database Administrator
iserbin(at)bostonsd(dot)ru
Mobile: 8-918-895-05-96 | Telegram: @iserbin_61
<https://t.me/iserbin_61>
DBI 24/7 contact: 8-800-333-65-79 | Website: http://dbi.ru/