analyze causes query planner to choose suboptimal plan for a select query in separate transaction

From: Martin Kováčik <kovacik(at)redbyte(dot)eu>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Date: 2019-04-25 14:37:46
Message-ID: CABp97hOoomefGd6HB8KOyDUdOt6+VjyNunUf_gn_gmVL2zZ2ZA@mail.gmail.com
Lists: pgsql-general

Hi group,

I have a Jenkins build server which runs integration tests of a Java
application against PostgreSQL 11.2. The tests run in serial fashion and
are reproducible because each test runs in separate DB transaction which is
rolled back at the end of each test. Each full build (with 1400+ tests)
takes approx 1h:45m.

My problem is that sometimes the build gets stuck for several hours (often
10+) in a random test, waiting on a random SELECT query, with the postgres
backend consuming 100% of one CPU core.
At least I thought it was random, until I figured out it was triggered by
the autovacuum daemon. The problem is not the vacuum process itself, but
the analyze step that runs together with autovacuum.
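For now I can at least keep autovacuum away from the affected tables while I investigate. This is only a sketch of the workaround, using my test table `a` below as a stand-in for the application tables (these are per-table storage parameters; autovacuum itself stays on globally):

```sql
-- keep the autovacuum daemon away from this table entirely
ALTER TABLE a SET (autovacuum_enabled = off);

-- or, to suppress only the analyze part, raise its trigger threshold
ALTER TABLE a SET (autovacuum_analyze_threshold = 2000000000);
```

This only hides the trigger, of course; it does not explain the planner behavior.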

To illustrate my situation, let's say my tests look like this:

BEGIN;

-- A: insert data for the test

-- B: i'll refer to this point later

-- C: select(s)

ROLLBACK;

Everything is fine until autovacuum (analyze) runs while the test is at
point B. After that, the query planner at point C chooses the wrong plan
and the query takes a long time to complete, blocking one CPU core for a
long time. It seems that the planner statistics seen inside the running
transaction are affected by the ANALYZE running outside of it: running
ANALYZE (outside the transaction) while the transaction sits at B makes
the query planner think there are no rows, because the inserts at point A
are not yet committed.

I prepared a simple test case to reproduce this behavior:

First you need to create a table:

create table a (id bigint primary key);

Then run this transaction:

begin;
insert into a
select * from generate_series(1, 1000);

-- during sleep execute analyze on this db in separate connection
select pg_sleep(10);

explain analyze select count(*) from (
select distinct a1, a2, a3, a4 from a a1
left join a a2 on a1.id > 900
left join a a3 on a2.id = a3.id
left join a a4 on a3.id = a4.id
left join a a5 on a5.id = a4.id
left join a a6 on a6.id = a5.id
left join a a7 on a7.id = a6.id
left join a a8 on a8.id = a7.id) temp;

rollback;
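For completeness, these are the commands I run in the second connection during the pg_sleep. The catalog query is just my way of confirming what ANALYZE recorded (I assume `pg_class.reltuples` is the relevant field here):

```sql
-- second connection, while the first session sleeps at pg_sleep(10)
ANALYZE a;

-- ANALYZE sees only committed rows, so it records the table as empty
SELECT reltuples, relpages FROM pg_class WHERE relname = 'a';
```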

The query plan for the select is as follows, and the query takes 865 ms to
finish.

                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1987304286213.68..1987304286213.69 rows=1 width=8) (actual time=862.406..862.407 rows=1 loops=1)
   ->  Unique  (cost=1942976891422.08..1987284286213.68 rows=1600000000 width=128) (actual time=783.281..858.261 rows=100900 loops=1)
         ->  Sort  (cost=1942976891422.08..1951838370380.40 rows=3544591583328 width=128) (actual time=783.280..825.055 rows=100900 loops=1)
               Sort Key: a1.*, a2.*, a3.*, a4.*
               Sort Method: external merge  Disk: 12368kB
               ->  Merge Right Join  (cost=5497387.28..53175003562.02 rows=3544591583328 width=128) (actual time=171.612..196.342 rows=100900 loops=1)
                     Merge Cond: (a4.id = a3.id)
                     ->  Merge Left Join  (cost=449798.05..1010691.54 rows=36848656 width=40) (actual time=8.257..8.815 rows=1000 loops=1)
                           Merge Cond: (a4.id = a5.id)
                           ->  Sort  (cost=158.51..164.16 rows=2260 width=40) (actual time=1.174..1.241 rows=1000 loops=1)
                                 Sort Key: a4.id
                                 Sort Method: quicksort  Memory: 103kB
                                 ->  Seq Scan on a a4  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.040..0.639 rows=1000 loops=1)
                           ->  Materialize  (cost=449639.54..465944.26 rows=3260943 width=8) (actual time=7.077..7.254 rows=1000 loops=1)
                                 ->  Sort  (cost=449639.54..457791.90 rows=3260943 width=8) (actual time=7.066..7.148 rows=1000 loops=1)
                                       Sort Key: a5.id
                                       Sort Method: quicksort  Memory: 71kB
                                       ->  Merge Right Join  (cost=2897.82..52270.24 rows=3260943 width=8) (actual time=4.352..6.630 rows=1000 loops=1)
                                             Merge Cond: (a7.id = a6.id)
                                             ->  Merge Left Join  (cost=317.01..711.38 rows=25538 width=8) (actual time=1.236..2.482 rows=1000 loops=1)
                                                   Merge Cond: (a7.id = a8.id)
                                                   ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.618..0.792 rows=1000 loops=1)
                                                         Sort Key: a7.id
                                                         Sort Method: quicksort  Memory: 71kB
                                                         ->  Seq Scan on a a7  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.017..0.265 rows=1000 loops=1)
                                                   ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.605..0.773 rows=1000 loops=1)
                                                         Sort Key: a8.id
                                                         Sort Method: quicksort  Memory: 71kB
                                                         ->  Seq Scan on a a8  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.024..0.261 rows=1000 loops=1)
                                             ->  Sort  (cost=2580.81..2644.65 rows=25538 width=16) (actual time=3.112..3.278 rows=1000 loops=1)
                                                   Sort Key: a6.id
                                                   Sort Method: quicksort  Memory: 71kB
                                                   ->  Merge Left Join  (cost=317.01..711.38 rows=25538 width=16) (actual time=1.386..2.653 rows=1000 loops=1)
                                                         Merge Cond: (a5.id = a6.id)
                                                         ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.697..0.870 rows=1000 loops=1)
                                                               Sort Key: a5.id
                                                               Sort Method: quicksort  Memory: 71kB
                                                               ->  Seq Scan on a a5  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.026..0.303 rows=1000 loops=1)
                                                         ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.681..0.851 rows=1000 loops=1)
                                                               Sort Key: a6.id
                                                               Sort Method: quicksort  Memory: 71kB
                                                               ->  Seq Scan on a a6  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.018..0.286 rows=1000 loops=1)
                     ->  Materialize  (cost=5047589.24..5143782.35 rows=19238623 width=104) (actual time=163.351..175.058 rows=100900 loops=1)
                           ->  Sort  (cost=5047589.24..5095685.80 rows=19238623 width=104) (actual time=163.343..167.508 rows=100900 loops=1)
                                 Sort Key: a3.id
                                 Sort Method: quicksort  Memory: 16766kB
                                 ->  Merge Right Join  (cost=322881.15..615728.13 rows=19238623 width=104) (actual time=119.338..144.270 rows=100900 loops=1)
                                       Merge Cond: (a3.id = a2.id)
                                       ->  Sort  (cost=158.51..164.16 rows=2260 width=40) (actual time=1.153..1.230 rows=1000 loops=1)
                                             Sort Key: a3.id
                                             Sort Method: quicksort  Memory: 103kB
                                             ->  Seq Scan on a a3  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.039..0.629 rows=1000 loops=1)
                                       ->  Materialize  (cost=322722.64..331235.31 rows=1702533 width=72) (actual time=118.179..131.501 rows=100900 loops=1)
                                             ->  Sort  (cost=322722.64..326978.97 rows=1702533 width=72) (actual time=118.173..122.691 rows=100900 loops=1)
                                                   Sort Key: a2.id
                                                   Sort Method: quicksort  Memory: 16780kB
                                                   ->  Nested Loop Left Join  (cost=0.00..76684.85 rows=1702533 width=72) (actual time=1.331..100.252 rows=100900 loops=1)
                                                         Join Filter: (a1.id > 900)
                                                         Rows Removed by Join Filter: 900000
                                                         ->  Seq Scan on a a1  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.019..0.292 rows=1000 loops=1)
                                                         ->  Materialize  (cost=0.00..43.90 rows=2260 width=40) (actual time=0.000..0.047 rows=1000 loops=1000)
                                                               ->  Seq Scan on a a2  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.016..0.606 rows=1000 loops=1)
 Planning Time: 2.234 ms
 Execution Time: 865.635 ms
(64 rows)

Now run the transaction again, but this time run ANALYZE in a separate DB
connection while the transaction sleeps at select pg_sleep(10).
The query plan is as follows, and the select query takes 81 s to complete.
As you can see, the planner greatly underestimated the number of rows
(1 vs 100900) and chose a nested-loop plan, which took a long time to
finish.

                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=40.12..40.13 rows=1 width=8) (actual time=81482.190..81482.190 rows=1 loops=1)
   ->  Unique  (cost=40.10..40.11 rows=1 width=128) (actual time=81418.552..81477.211 rows=100900 loops=1)
         ->  Sort  (cost=40.10..40.10 rows=1 width=128) (actual time=81418.551..81440.453 rows=100900 loops=1)
               Sort Key: a1.*, a2.*, a3.*, a4.*
               Sort Method: external merge  Disk: 12376kB
               ->  Nested Loop Left Join  (cost=0.00..40.09 rows=1 width=128) (actual time=1.424..81324.707 rows=100900 loops=1)
                     Join Filter: (a8.id = a7.id)
                     Rows Removed by Join Filter: 100800000
                     ->  Nested Loop Left Join  (cost=0.00..35.08 rows=1 width=136) (actual time=1.271..70436.483 rows=100900 loops=1)
                           Join Filter: (a7.id = a6.id)
                           Rows Removed by Join Filter: 100800000
                           ->  Nested Loop Left Join  (cost=0.00..30.06 rows=1 width=136) (actual time=1.117..59388.674 rows=100900 loops=1)
                                 Join Filter: (a6.id = a5.id)
                                 Rows Removed by Join Filter: 100800000
                                 ->  Nested Loop Left Join  (cost=0.00..25.05 rows=1 width=136) (actual time=0.962..48361.409 rows=100900 loops=1)
                                       Join Filter: (a5.id = a4.id)
                                       Rows Removed by Join Filter: 100800000
                                       ->  Nested Loop Left Join  (cost=0.00..20.04 rows=1 width=136) (actual time=0.807..37348.411 rows=100900 loops=1)
                                             Join Filter: (a3.id = a4.id)
                                             Rows Removed by Join Filter: 100800000
                                             ->  Nested Loop Left Join  (cost=0.00..15.02 rows=1 width=104) (actual time=0.544..18687.670 rows=100900 loops=1)
                                                   Join Filter: (a2.id = a3.id)
                                                   Rows Removed by Join Filter: 100800000
                                                   ->  Nested Loop Left Join  (cost=0.00..10.01 rows=1 width=72) (actual time=0.283..190.095 rows=100900 loops=1)
                                                         Join Filter: (a1.id > 900)
                                                         Rows Removed by Join Filter: 900000
                                                         ->  Seq Scan on a a1  (cost=0.00..5.00 rows=1 width=40) (actual time=0.013..0.352 rows=1000 loops=1)
                                                         ->  Seq Scan on a a2  (cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.138 rows=1000 loops=1000)
                                                   ->  Seq Scan on a a3  (cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.140 rows=1000 loops=100900)
                                             ->  Seq Scan on a a4  (cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.141 rows=1000 loops=100900)
                                       ->  Seq Scan on a a5  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
                                 ->  Seq Scan on a a6  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
                           ->  Seq Scan on a a7  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
                     ->  Seq Scan on a a8  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
 Planning Time: 0.522 ms
 Execution Time: 81483.797 ms
(36 rows)

What do you think, is this intended behavior or a potential bug?

Regards, Martin
*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik(at)redbyte(dot)eu, www.redbyte.eu <http://redbyte.eu>
