Re: Planning performance problem (67626.278ms)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Jeremy Schneider <schnjere(at)amazon(dot)com>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Planning performance problem (67626.278ms)
Date: 2021-04-21 14:14:21
Message-ID: CAApHDvq9o1J45is6w-3Gv1UgtuDQ4eQP4UhFPQSnQmEH_EijvQ@mail.gmail.com
Lists: pgsql-performance

On Thu, 22 Apr 2021 at 00:03, Jeremy Schneider <schnjere(at)amazon(dot)com> wrote:
>
> Two years later, I still remember this. And today I just confirmed
> someone hitting this on open source PG13.

The only thing that has changed about get_actual_variable_range() is that
it now uses a SnapshotNonVacuumable snapshot. Previously, a
long-running transaction could prevent vacuum from removing tuples,
which could make get_actual_variable_range() slow when it had to skip
over the unvacuumable dead tuples.

That's no longer a problem: SnapshotNonVacuumable sees any tuple that
must be kept around for the long-running transaction, so that tuple is
used to determine the range instead of being skipped over.
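To make the pre-change scenario concrete, here's a rough sketch (assuming two psql sessions, and the table "a" from the test case below) of how a long-running transaction used to hurt planning:

```sql
-- session 1: hold back the xmin horizon with a long-running transaction
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM a;  -- takes a snapshot that vacuum must respect

-- session 2: delete rows at the top of the indexed range
DELETE FROM a WHERE a > 5000000;
VACUUM a;  -- cannot remove those dead tuples while session 1 is open

-- before the SnapshotNonVacuumable change, planning this query had to
-- step over all the dead-but-unvacuumable index entries to find the
-- actual upper bound of the range; now they simply count as the bound
EXPLAIN SELECT * FROM a WHERE a > 4000000;
```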

Anyone with a large number of vacuum-removable tuples at either end of
the range of a btree-indexed column could still see this issue.
Vacuuming more often is worth considering. In the original report on
this thread, the table contained more dead tuples than live ones;
disabling autovacuum, or tuning it so that it waits that long, is
likely a bad idea.
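If one particular table accumulates dead tuples at the ends of an indexed range, per-table storage parameters can make autovacuum kick in sooner without changing the cluster-wide settings. A sketch (the values here are illustrative, not recommendations):

```sql
-- trigger autovacuum on "a" after ~1% of the table (plus 1000 rows)
-- has changed, instead of the default 20%
ALTER TABLE a SET (autovacuum_vacuum_scale_factor = 0.01,
                   autovacuum_vacuum_threshold = 1000);
```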

FWIW, here's a simple test case that shows the problem in current master.

create table a (a int primary key) with (autovacuum_enabled = off);
insert into a select x from generate_series(1,10000000) x;
analyze a;
delete from a;
\timing on
explain select * from a where a < 10000000;
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on a  (cost=0.00..169247.71 rows=9998977 width=4)
   Filter: (a < 10000000)
(2 rows)

Time: 9062.600 ms (00:09.063)

vacuum a;
explain select * from a where a < 10000000;
                   QUERY PLAN
-------------------------------------------------
 Seq Scan on a  (cost=0.00..0.00 rows=1 width=4)
   Filter: (a < 10000000)
(2 rows)

Time: 2.665 ms

Notice that planning became fast again after the vacuum.

David
