Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on

From: Alessandro Aste <alessandro(dot)aste(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on
Date: 2018-03-21 16:09:57
Message-ID: CAM9F+O1szk4_sPbzRhwFhTwUyz-H1tixrXGA0gRzR6bZwREA0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi there, we are using postgresql 10.3 and we're facing an issue with a
query. The query (full query below) completes only when:

1 - LIMIT 10 is removed
or
2 - show max_parallel_workers_per_gather is set to 0, so parallel
processing is disabled.

With max_parallel_workers_per_gather set to the default value (8) I'm
not even able to get the query plan.

Notes:

- We're experiencing the issue in any server of ours but I've reproduced
the issue in a fresh restored database with full vacuum/reindex of the
tables.
- We didn't touch any parameter concering the parallel processing, we're
running the defaults:

cmdstaging=# show max_parallel_workers_per_gather ;
max_parallel_workers_per_gather
---------------------------------
8
(1 row)

cmdstaging=# show max_worker_processes ;
max_worker_processes
----------------------
8
(1 row)

cmdstaging=# show max_parallel_workers;
max_parallel_workers
----------------------
8
(1 row)

The query completes only omitting the LIMIT clause or when I disable
parallel processing:

id | vendor_id | gaa | pop_name | pop_status | pop_location |
pop_provider_id | pop_provider | pop_street | pop_city | pop
_postal_code | pop_state | pop_country | pop_country_id
--------+-----------+-----+---------------+------------+--------------+-----------------+--------------+---------------------+----------+----
-------------+-----------+-------------+----------------
684807 | 12346 | | GTT/POP/LON1T | Active | LON1T |
12288 | Telehouse UK | 14 Coriander Avenue | London | E14
2AA | | GB | 219
(1 row)

Time: 4374.759 ms (00:04.375)
cmdstaging=# show max_parallel_workers_per_gather ;
max_parallel_workers_per_gather
---------------------------------
0
(1 row)

Time: 0.097 ms

Otherwise it just keep running for forever.

This is the full query:

SELECT * FROM (
SELECT
seg.circuit_id AS id,
vendor_gtt_pop.vendor_id,
CASE WHEN vendor_gtt_pop.push_to_gaa = 1 THEN 'Y' END as gaa,
pop.gii_circuitid AS pop_name,
cst.label AS pop_status,
seg.a_company_name AS pop_location,
seg.vendor_id AS pop_provider_id,
seg.vendor_name AS pop_provider,
cs.address1 AS pop_street,
cs.city AS pop_city,
cs.postal_code AS pop_postal_code,
cs.state AS pop_state,
csc.code AS pop_country,
cs.country_id AS pop_country_id
FROM (
SELECT c.gii_circuitid, max(so.id) AS service_order_id
FROM service_order so
join circuit c on c.product_id=so.product_id
join master_service_order mso on mso.id=so.master_service_order_id
WHERE NOT (so.ordertype_id = 2 AND so.status <> 999) AND
NOT (so.ordertype_id = 3 AND so.status <> 999) AND
c.status >= 20 AND
c.status not in (160,999) AND
mso.client_id=11615 AND
c.service_description=28 AND
c.status!=160
GROUP BY c.gii_circuitid
) pop
JOIN service_order so ON so.id = pop.service_order_id
left JOIN client_site cs on cs.id=so.a_site_id
left JOIN country csc on csc.id=cs.country_id
JOIN circuit c ON so.product_id=c.product_id
JOIN circuit_status cst ON cst.id=c.status
JOIN (
SELECT c.id AS circuit_id, sg.id AS segment_id,
c.pop_support_vendor_id AS vendor_id,v.name AS vendor_name,
sg.a_company_name
FROM segment sg
JOIN circuit_layout cl ON cl.segment_id = sg.id AND cl.ordinal = 1
JOIN circuit c ON c.id = cl.circuit_id
JOIN vendor v ON v.id = c.pop_support_vendor_id
) seg ON seg.circuit_id = c.id
JOIN vendor_gtt_pop on vendor_gtt_pop.gtt_pop_id = seg.circuit_id
) foo where vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id LIMIT 10

Execution plan with max_parallel_workers_per_gather =0 ,
max_parallel_workers_per_gather =8 and no LIMIT clause :

=== QUERY PLAN WITH max_parallel_workers_per_gather = 0 ===

Limit (cost=433508.77..433508.77 rows=1 width=176) (actual
time=2992.875..2992.875 rows=1 loops=1)
-> Sort (cost=433508.77..433508.77 rows=1 width=176) (actual
time=2992.874..2992.874 rows=1 loops=1)
Sort Key: c_2.gii_circuitid, c_1.id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=168438.04..433508.76 rows=1 width=176)
(actual time=1747.038..2992.864 rows=1 loops=1)
Join Filter: (c_1.id = vendor_gtt_pop.gtt_pop_id)
Rows Removed by Join Filter: 662
-> Hash Join (cost=168438.04..433486.80 rows=1 width=148)
(actual time=1540.484..2952.522 rows=663 loops=1)
Hash Cond: (c_1.id = c.id)
-> Hash Join (cost=125957.20..388048.64 rows=788618
width=63) (actual time=1335.406..2746.835 rows=1582 loops=1)
Hash Cond: (c_1.pop_support_vendor_id = v.id)
-> Hash Join (cost=125518.15..377653.28
rows=788618 width=40) (actual time=1330.727..2711.206 rows=778564 loops=
1)
Hash Cond: (sg.id = cl.segment_id)
-> Seq Scan on segment sg
(cost=0.00..231189.42 rows=3482542 width=32) (actual time=0.004..472.967
rows=34
83328 loops=1)
-> Hash (cost=115660.43..115660.43
rows=788618 width=16) (actual time=1328.852..1328.852 rows=778564 loops
=1)
Buckets: 1048576 Batches: 1 Memory
Usage: 41653kB
-> Merge Join
(cost=4.59..115660.43 rows=788618 width=16) (actual time=0.034..1179.131
rows=778564 l
oops=1)
Merge Cond: (cl.circuit_id =
c_1.id)
-> Index Only Scan using
circuit_layout_idx on circuit_layout cl (cost=0.43..61739.87 rows=788
618 width=8) (actual time=0.016..348.350 rows=778564 loops=1)
Index Cond: (ordinal = 1)
Heap Fetches: 778564
-> Index Scan using
circuit_id_key on circuit c_1 (cost=0.42..41790.58 rows=909014 width=8) (a
ctual time=0.011..326.429 rows=909014 loops=1)
-> Hash (cost=325.69..325.69 rows=9069
width=27) (actual time=2.753..2.753 rows=9069 loops=1)
Buckets: 16384 Batches: 1 Memory Usage:
669kB
-> Seq Scan on vendor v
(cost=0.00..325.69 rows=9069 width=27) (actual time=0.006..1.489 rows=9069
loops=1
)
-> Hash (cost=42480.82..42480.82 rows=1 width=85)
(actual time=205.073..205.073 rows=882 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 102kB
-> Nested Loop (cost=42469.41..42480.82 rows=1
width=85) (actual time=184.020..204.776 rows=882 loops=1)
Join Filter: (c.status = cst.id)
Time: 3016.688 ms (00:03.017)

=== QUERY PLAN WITH max_parallel_workers_per_gather = 8 and NO LIMIT
clause ===

-
-----------------------------------------------------------------------------------
Sort (cost=253514.80..253514.81 rows=1 width=176) (actual
time=1121.737..1121.737 rows=1 loops=1)
Sort Key: c_2.gii_circuitid, c_1.id
Sort Method: quicksort Memory: 25kB
-> Hash Join (cost=33193.56..253514.79 rows=1 width=176) (actual
time=1121.723..1121.726 rows=1 loops=1)
Hash Cond: (c_1.id = c.id)
-> Gather (cost=1466.02..221787.23 rows=3 width=75) (actual
time=1027.012..1027.015 rows=1 loops=1)
Workers Planned: 5
Workers Launched: 5
-> Hash Join (cost=466.02..220786.93 rows=1 width=75)
(actual time=964.279..1020.204 rows=0 loops=6)
Hash Cond: (c_1.id = vendor_gtt_pop.gtt_pop_id)
-> Hash Join (cost=444.07..219779.19 rows=157724
width=63) (actual time=122.728..1019.934 rows=264 loops=6)
Hash Cond: (c_1.pop_support_vendor_id = v.id)
-> Merge Join (cost=5.02..217348.87
rows=157724 width=40) (actual time=1.089..1007.302 rows=129761 loops=6)
Merge Cond: (cl.circuit_id = c_1.id)
-> Nested Loop (cost=0.86..171314.49
rows=157724 width=32) (actual time=0.056..477.965 rows=129761 loops=6
)
-> Parallel Index Only Scan using
circuit_layout_idx on circuit_layout cl (cost=0.43..55430.93 rows=
157724 width=8) (actual time=0.031..96.328 rows=129761 loops=6)
Index Cond: (ordinal = 1)
Heap Fetches: 163021
-> Index Scan using uniqid on
segment sg (cost=0.43..0.73 rows=1 width=32) (actual time=0.002..0.002
rows=1 loops=778564)
Index Cond: (id =
cl.segment_id)
-> Index Scan using circuit_id_key on
circuit c_1 (cost=0.42..41790.58 rows=909014 width=8) (actual time=0
.018..377.993 rows=908914 loops=6)
-> Hash (cost=325.69..325.69 rows=9069
width=27) (actual time=3.441..3.441 rows=9069 loops=6)
Buckets: 16384 Batches: 1 Memory Usage:
669kB
-> Seq Scan on vendor v
(cost=0.00..325.69 rows=9069 width=27) (actual time=0.013..1.850 rows=9069
loops=6
)
-> Hash (cost=21.91..21.91 rows=3 width=12) (actual
time=0.118..0.118 rows=1 loops=6)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on vendor_gtt_pop
(cost=0.00..21.91 rows=3 width=12) (actual time=0.113..0.113 rows=1 loops=6)
Filter: (vendor_id = 12346)
Rows Removed by Filter: 1192
-> Hash (cost=31727.53..31727.53 rows=1 width=85) (actual
time=94.693..94.693 rows=882 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 102kB
-> Nested Loop (cost=31725.15..31727.53 rows=1 width=85)
(actual time=83.507..94.344 rows=882 loops=1)
-> Nested Loop (cost=31725.02..31727.38 rows=1
width=74) (actual time=83.496..93.575 rows=882 loops=1)
-> Nested Loop Left Join
(cost=31724.59..31726.89 rows=1 width=70) (actual time=83.484..90.885
rows=882 loops=1)
-> Nested Loop Left Join
(cost=31724.45..31726.73 rows=1 width=68) (actual time=83.475..90.067
rows=882 lo
ops=1)
-> Nested Loop
(cost=31724.02..31726.27 rows=1 width=30) (actual time=83.461..87.442
rows=882 loops=
1)
-> GroupAggregate
(cost=31723.60..31723.62 rows=1 width=26) (actual time=83.446..83.812 rows=8
82 loops=1)
Group Key:
c_2.gii_circuitid
-> Sort
(cost=31723.60..31723.60 rows=1 width=26) (actual time=83.441..83.515
rows=884 l
oops=1)
Sort Key:
c_2.gii_circuitid
Sort Method:
quicksort Memory: 94kB
-> Gather
(cost=1000.85..31723.59 rows=1 width=26) (actual time=2.159..81.342 rows
=884 loops=1)
Workers
Planned: 3
Workers
Launched: 2
-> Nested
Loop (cost=0.85..30723.49 rows=1 width=26) (actual time=8.509..75.
845 rows=295 loops=3)
->
Nested Loop (cost=0.42..30722.56 rows=2 width=30) (actual time=8.48
3..74.283 rows=295 loops=3)

-> Parallel Seq Scan on circuit c_2 (cost=0.00..30714.61 rows=3
width=26) (actual time=5.992..72.563 rows=316 loops=3)

Filter: ((status >= 20) AND (status <> ALL ('{160,999}'::int
eger[])) AND (status <> 160) AND (service_description = 28))

Rows Removed by Filter: 302689

-> Index Scan using so_pid_idx on service_order so_1 (cost=0.42.
.2.65 rows=1 width=12) (actual time=0.005..0.005 rows=1 loops=948)

Index Cond: (product_id = c_2.product_id)

Filter: (((ordertype_id <> 2) OR (status = 999)) AND ((order
type_id <> 3) OR (status = 999)))

Rows Removed by Filter: 0
->
Index Scan using master_service_order_id_key on master_service_order
mso (cost=0.42..0.46 rows=1 width=4) (actual time=0.005..0.005 rows=1
loops=884)

Index Cond: (id = so_1.master_service_order_id)

Filter: (client_id = 11615)
-> Index Scan using
service_order_id_key on service_order so (cost=0.42..2.64 rows=1 width=12)
(actual time=0.003..0.003 rows=1 loops=882)
Index Cond: (id = (max(
so_1.id)))
-> Index Scan using
client_site_pkey on client_site cs (cost=0.42..0.46 rows=1 width=46)
(actual tim
e=0.003..0.003 rows=1 loops=882)
Index Cond: (id = so.a_site_id)
-> Index Scan using country_pkey on
country csc (cost=0.14..0.16 rows=1 width=6) (actual time=0.001..0.001
rows=1 loops=882)
Index Cond: (id = cs.country_id)
-> Index Scan using circuit_product_id_idx on
circuit c (cost=0.42..0.49 rows=1 width=12) (actual time=0.003..0.
003 rows=1 loops=882)
Index Cond: (product_id = so.product_id)
-> Index Scan using cs_id_c on circuit_status cst
(cost=0.14..0.16 rows=1 width=19) (actual time=0.001..0.001 rows=1 l
oops=882)
Index Cond: (id = c.status)
Planning time: 3.889 ms
Execution time: 1122.357 ms
(68 rows)

Thank you very much for your help,

Kind regards,

Alessandro Aste.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2018-03-21 16:50:23 Re: Re: Foreign Key Validation after Reference Table Ownership Change
Previous Message Battuello, Louis 2018-03-21 15:30:33 RE: Re: Foreign Key Validation after Reference Table Ownership Change