slow get_actual_variable_range with long running transactions

From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: slow get_actual_variable_range with long running transactions
Date: 2020-06-16 15:11:55
Message-ID: db7111f2-05ef-0ceb-c013-c34adf4f4121@gmail.com

Hi,

We're having an issue with planner performance when doing large deletes while a long-running transaction is open. From what we gathered, it is caused by the scan the planner performs to find the actual minimum and maximum values of the indexed column.

Instead of trying to explain what happens in the abstract, here is a very simple example (our load is very similar to this scenario):

The "test" table has one column with an index on it.

Session 1:
=# insert into test select generate_series(1,10000000);

Session 2: start the long-running transaction:
=# begin;
=# do_whatever_to_get_a_long_running_transaction
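(Any transaction that keeps an old snapshot will do here; one way to simulate it for testing, purely illustrative, is a repeatable read transaction that has run at least one query and is then left idle:)

=# begin isolation level repeatable read;
=# select count(*) from test;   -- takes the transaction snapshot, held until commit/rollback
-- then leave this session idle in transaction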

Session 1:
=# delete from test where a>1000000;
=# analyze test;
=# explain select * from test where a > 11000000;
                              QUERY PLAN
----------------------------------------------------------------------
 Index Only Scan using idxa on test  (cost=0.42..4.44 rows=1 width=4)
   Index Cond: (a > 11000000)
(2 rows)

Time: 2606,068 ms (00:02,606)

Of course, what happens here is that the histogram says that max(a) is 1000000, so get_actual_variable_range goes to the index to verify the real upper bound, and has to read through quite a few dead index entries along the way.
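For what it's worth, both sides of this are easy to see from the statistics views (purely illustrative, nothing specific to our setup): the histogram ends around 1000000, while the table still carries millions of dead rows whose index entries cannot be cleaned because of the old snapshot:

=# select histogram_bounds from pg_stats where tablename = 'test' and attname = 'a';
=# select n_live_tup, n_dead_tup from pg_stat_user_tables where relname = 'test';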

Is there anything we can do to avoid this problem (apart from the long-running transaction itself, which unfortunately is out of our control)?

Regards
