Fast HashJoin only after a cluster/recreate table

From: Alexandre de Arruda Paes <adaldeia(at)gmail(dot)com>
To: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Fast HashJoin only after a cluster/recreate table
Date: 2016-04-01 02:44:50
Message-ID: CAGewt-t0izL+zaK6B3YpMDbkBoTU=ppBUY1exnFTVT4J6PV67Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

In the query below, the planner choose an extreme slow mergejoin(380
seconds). 'Vacuum analyze' can't help.
If I CLUSTER (or recreate) table ES09T1, the planner choose a faster
hashjoin (about 10 seconds). But, obviously, I can't do that with the users
connected.
After some time after cluster(generally in the same day), the problem
returns. Autovacuum is on, but the tables are vacuumed forced after
pg_dump, 3 times in a day (00:00 - 12:00 - 23:00).

Postgresql 9.4.5
128GB RAM/10xRAID10 SAS 15k
shared_buffers = 8GB

work_mem = 256MB

maintenance_work_mem = 16GB
random_page_cost = 2.0

effective_cache_size = 120GB

db=# explain (buffers,analyze) SELECT T1.es09item, T1.es09status,
T3.es09usuari, T3.es09datreq, T2.es08desdoc AS es09desdoc, T1.es09numdoc,
T1.es09tipdoc AS es09tipdoc, T1.es09codemp, COALESCE( T4.es09quatre, 0) AS
es09quatre FROM (((ES09T1 T1 LEFT JOIN ES08T T2 ON T2.es08tipdoc =
T1.es09tipdoc) LEFT JOIN ES09T T3 ON T3.es09codemp = T1.es09codemp AND
T3.es09tipdoc = T1.es09tipdoc AND T3.es09numdoc = T1.es09numdoc) LEFT JOIN
(SELECT COUNT(*) AS es09quatre, es09codemp, es09tipdoc, es09numdoc FROM
ES09T1 GROUP BY es09codemp, es09tipdoc, es09numdoc ) T4 ON T4.es09codemp =
T1.es09codemp AND T4.es09tipdoc = T1.es09tipdoc AND T4.es09numdoc =
T1.es09numdoc) WHERE (T1.es09codemp = 1) and (T3.es09datreq >= '2016-02-02'
and T3.es09datreq <= '2016-02-02') and (T3.es09usuari like
'%%%%%%%%%%%%%%%%%%%%') and (T1.es09tipdoc like '%%%%%') ORDER BY
T1.es09codemp, T1.es09numdoc DESC, T1.es09tipdoc;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=289546.93..289546.94 rows=2 width=78) (actual
time=380405.796..380405.929 rows=2408 loops=1)
Sort Key: t1.es09numdoc, t1.es09tipdoc
Sort Method: quicksort Memory: 435kB
Buffers: shared hit=82163
-> Merge Left Join (cost=47.09..289546.92 rows=2 width=78) (actual
time=1133.077..380398.160 rows=2408 loops=1)
Merge Cond: (t1.es09tipdoc = es09t1.es09tipdoc)
Join Filter: ((es09t1.es09codemp = t1.es09codemp) AND
(es09t1.es09numdoc = t1.es09numdoc))
Rows Removed by Join Filter: 992875295
Buffers: shared hit=82163
-> Merge Left Join (cost=46.53..49.29 rows=2 width=70) (actual
time=12.206..18.155 rows=2408 loops=1)
Merge Cond: (t1.es09tipdoc = t2.es08tipdoc)
Buffers: shared hit=6821
-> Sort (cost=9.19..9.19 rows=2 width=44) (actual
time=11.611..12.248 rows=2408 loops=1)
Sort Key: t1.es09tipdoc
Sort Method: quicksort Memory: 285kB
Buffers: shared hit=6814
-> Nested Loop (cost=1.11..9.18 rows=2 width=44)
(actual time=0.040..10.398 rows=2408 loops=1)
Buffers: shared hit=6814
-> Index Scan using ad_es09t_1 on es09t t3
(cost=0.56..4.58 rows=1 width=42) (actual time=0.020..0.687 rows=1212
loops=1)
Index Cond: ((es09codemp = 1) AND
(es09datreq >= '2016-02-02'::date) AND (es09datreq <= '2016-02-02'::date))
Filter: (es09usuari ~~
'%%%%%%%%%%%%%%%%%%%%'::text)
Buffers: shared hit=108
-> Index Scan using es09t1_pkey on es09t1 t1
(cost=0.56..4.59 rows=1 width=19) (actual time=0.006..0.007 rows=2
loops=1212)
Index Cond: ((es09codemp = 1) AND
(es09tipdoc = t3.es09tipdoc) AND (es09numdoc = t3.es09numdoc))
Filter: (es09tipdoc ~~ '%%%%%'::text)
Buffers: shared hit=6706
-> Sort (cost=37.35..38.71 rows=547 width=32) (actual
time=0.592..2.206 rows=2919 loops=1)
Sort Key: t2.es08tipdoc
Sort Method: quicksort Memory: 67kB
Buffers: shared hit=7
-> Seq Scan on es08t t2 (cost=0.00..12.47 rows=547
width=32) (actual time=0.003..0.126 rows=547 loops=1)
Buffers: shared hit=7
-> Materialize (cost=0.56..287644.85 rows=716126 width=23)
(actual time=0.027..68577.800 rows=993087854 loops=1)
Buffers: shared hit=75342
-> GroupAggregate (cost=0.56..278693.28 rows=716126
width=15) (actual time=0.025..4242.453 rows=3607573 loops=1)
Group Key: es09t1.es09codemp, es09t1.es09tipdoc,
es09t1.es09numdoc
Buffers: shared hit=75342
-> Index Only Scan using es09t1_pkey on es09t1
(cost=0.56..199919.49 rows=7161253 width=15) (actual time=0.016..1625.031
rows=7160921 loops=1)
Index Cond: (es09codemp = 1)
Heap Fetches: 51499
Buffers: shared hit=75342
Planning time: 50.129 ms
Execution time: 380419.435 ms
(43 rows)

db=# vacuum ANALYZE es09t1;
VACUUM

db=# explain SELECT T1.es09item, T1.es09status, T3.es09usuari,
T3.es09datreq, T2.es08desdoc AS es09desdoc, T1.es09numdoc, T1.es09tipdoc AS
es09tipdoc, T1.es09codemp, COALESCE( T4.es09quatre, 0) AS es09quatre FROM
(((ES09T1 T1 LEFT
JOIN ES08T T2 ON T2.es08tipdoc = T1.es09tipdoc) LEFT JOIN ES09T T3 ON
T3.es09codemp = T1.es09codemp AND T3.es09tipdoc = T1.es09tipdoc AND
T3.es09numdoc = T1.es09numdoc) LEFT JOIN (SELECT COUNT(*) AS es09quatre,
es09codemp, es09tipdoc, e
s09numdoc FROM ES09T1 GROUP BY es09codemp, es09tipdoc, es09numdoc ) T4 ON
T4.es09codemp = T1.es09codemp AND T4.es09tipdoc = T1.es09tipdoc AND
T4.es09numdoc = T1.es09numdoc) WHERE (T1.es09codemp = 1) and (T3.es09datreq
>= '2016-02-02' and
T3.es09datreq <= '2016-02-02') and (T3.es09usuari like
'%%%%%%%%%%%%%%%%%%%%') and (T1.es09tipdoc like '%%%%%') ORDER BY
T1.es09codemp, T1.es09numdoc DESC, T1.es09tipdoc;

QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=288400.09..288400.09 rows=2 width=78)
Sort Key: t1.es09numdoc, t1.es09tipdoc
-> Merge Left Join (cost=46.22..288400.08 rows=2 width=78)
Merge Cond: (t1.es09tipdoc = es09t1.es09tipdoc)
Join Filter: ((es09t1.es09codemp = t1.es09codemp) AND
(es09t1.es09numdoc = t1.es09numdoc))
-> Merge Left Join (cost=45.66..48.43 rows=2 width=70)
Merge Cond: (t1.es09tipdoc = t2.es08tipdoc)
-> Sort (cost=9.19..9.19 rows=2 width=44)
Sort Key: t1.es09tipdoc
-> Nested Loop (cost=1.11..9.18 rows=2 width=44)
-> Index Scan using ad_es09t_1 on es09t t3
(cost=0.56..4.58 rows=1 width=42)
Index Cond: ((es09codemp = 1) AND
(es09datreq >= '2016-02-02'::date) AND (es09datreq <= '2016-02-02'::date))
Filter: (es09usuari ~~
'%%%%%%%%%%%%%%%%%%%%'::text)
-> Index Scan using es09t1_pkey on es09t1 t1
(cost=0.56..4.59 rows=1 width=19)
Index Cond: ((es09codemp = 1) AND
(es09tipdoc = t3.es09tipdoc) AND (es09numdoc = t3.es09numdoc))
Filter: (es09tipdoc ~~ '%%%%%'::text)
-> Sort (cost=36.47..37.84 rows=549 width=32)
Sort Key: t2.es08tipdoc
-> Seq Scan on es08t t2 (cost=0.00..11.49 rows=549
width=32)
-> Materialize (cost=0.56..286496.26 rows=716037 width=23)
-> GroupAggregate (cost=0.56..277545.79 rows=716037
width=15)
Group Key: es09t1.es09codemp, es09t1.es09tipdoc,
es09t1.es09numdoc
-> Index Only Scan using es09t1_pkey on es09t1
(cost=0.56..198781.81 rows=7160361 width=15)
Index Cond: (es09codemp = 1)
(24 rows)

----------------------------------------------------------------------------

db=# cluster es09t1;
CLUSTER

db=# explain (buffers,analyze) SELECT T1.es09item, T1.es09status,
T3.es09usuari, T3.es09datreq, T2.es08desdoc AS es09desdoc, T1.es09numdoc,
T1.es09tipdoc AS es09tipdoc, T1.es09codemp, COALESCE( T4.es09quatre, 0) AS
es09quatre FROM (((ES09T1 T1 LEFT JOIN ES08T T2 ON T2.es08tipdoc =
T1.es09tipdoc) LEFT JOIN ES09T T3 ON T3.es09codemp = T1.es09codemp AND
T3.es09tipdoc = T1.es09tipdoc AND T3.es09numdoc = T1.es09numdoc) LEFT JOIN
(SELECT COUNT(*) AS es09quatre, es09codemp, es09tipdoc, es09numdoc FROM
ES09T1 GROUP BY es09codemp, es09tipdoc, es09numdoc ) T4 ON T4.es09codemp =
T1.es09codemp AND T4.es09tipdoc = T1.es09tipdoc AND T4.es09numdoc =
T1.es09numdoc) WHERE (T1.es09codemp = 1) and (T3.es09datreq >= '2016-02-02'
and T3.es09datreq <= '2016-02-02') and (T3.es09usuari like
'%%%%%%%%%%%%%%%%%%%%') and (T1.es09tipdoc like '%%%%%') ORDER BY
T1.es09codemp, T1.es09numdoc DESC, T1.es09tipdoc;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=483816.33..483816.34 rows=2 width=78) (actual
time=8794.833..8795.001 rows=2408 loops=1)
Sort Key: t1.es09numdoc, t1.es09tipdoc
Sort Method: quicksort Memory: 435kB
Buffers: shared hit=13649 read=299785
-> Nested Loop Left Join (cost=461417.89..483816.32 rows=2 width=78)
(actual time=6563.106..8790.845 rows=2408 loops=1)
Buffers: shared hit=13649 read=299785
-> Hash Right Join (cost=461417.61..483815.72 rows=2 width=52)
(actual time=6563.082..8782.169 rows=2408 loops=1)
Hash Cond: ((es09t1.es09codemp = t1.es09codemp) AND
(es09t1.es09tipdoc = t1.es09tipdoc) AND (es09t1.es09numdoc = t1.es09numdoc))
Buffers: shared hit=6425 read=299785
-> HashAggregate (cost=461408.40..468575.79 rows=716739
width=15) (actual time=6548.467..7866.944 rows=3607578 loops=1)
Group Key: es09t1.es09codemp, es09t1.es09tipdoc,
es09t1.es09numdoc
Buffers: shared hit=421 read=299566
-> Seq Scan on es09t1 (cost=0.00..389734.56
rows=7167384 width=15) (actual time=2.154..1818.148 rows=7160931 loops=1)
Filter: (es09codemp = 1)
Rows Removed by Filter: 11849
Buffers: shared hit=421 read=299566
-> Hash (cost=9.18..9.18 rows=2 width=44) (actual
time=12.486..12.486 rows=2408 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 188kB
Buffers: shared hit=6004 read=219
-> Nested Loop (cost=1.11..9.18 rows=2 width=44)
(actual time=0.076..11.112 rows=2408 loops=1)
Buffers: shared hit=6004 read=219
-> Index Scan using ad_es09t_1 on es09t t3
(cost=0.56..4.58 rows=1 width=42) (actual time=0.035..0.743 rows=1212
loops=1)
Index Cond: ((es09codemp = 1) AND
(es09datreq >= '2016-02-02'::date) AND (es09datreq <= '2016-02-02'::date))
Filter: (es09usuari ~~
'%%%%%%%%%%%%%%%%%%%%'::text)
Buffers: shared hit=98 read=12
-> Index Scan using es09t1_pkey on es09t1 t1
(cost=0.56..4.59 rows=1 width=19) (actual time=0.007..0.008 rows=2
loops=1212)
Index Cond: ((es09codemp = 1) AND
(es09tipdoc = t3.es09tipdoc) AND (es09numdoc = t3.es09numdoc))
Filter: (es09tipdoc ~~ '%%%%%'::text)
Buffers: shared hit=5906 read=207
-> Index Scan using es08t_pkey on es08t t2 (cost=0.28..0.29
rows=1 width=32) (actual time=0.002..0.003 rows=1 loops=2408)
Index Cond: (es08tipdoc = t1.es09tipdoc)
Buffers: shared hit=7224
Planning time: 14.498 ms
Execution time: 8819.824 ms
(34 rows)

Best regards,

Alexandre

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message jarek 2016-04-01 07:54:03 Re: Big number of connections
Previous Message Karl Denninger 2016-03-31 23:27:07 Re: Big number of connections