Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on

From: Alessandro Aste <alessandro(dot)aste(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on
Date: 2018-03-21 19:44:57
Message-ID: CAM9F+O3AaSUeF43KFvUHxpL+1vf4JB8V8hD2eaC5rKQpNtuCCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks for your reply Tomas. The query just got stuck for forever. I
observed no CPU spikes, it is currently running and I see 89 of the CPU
idle. I'm using a relative powerfull machine (120 GB ram, 56 CPU).

Plain analyze as requested. :

QUERY PLAN

------------------------------------------------------------
------------------------------------------------------------
---------------------
-------------------------------------------------------------
Limit (cost=253523.56..253523.57 rows=1 width=176)
-> Sort (cost=253523.56..253523.57 rows=1 width=176)
Sort Key: c_2.gii_circuitid, c_1.id
-> Nested Loop (cost=33190.89..253523.55 rows=1 width=176)
Join Filter: (c_1.id = c.id)
-> Nested Loop (cost=31724.87..31736.29 rows=1 width=85)
Join Filter: (c.status = cst.id)
-> Nested Loop (cost=31724.87..31734.84 rows=1
width=74)
-> Nested Loop Left Join
(cost=31724.45..31734.35 rows=1 width=70)
Join Filter: (csc.id = cs.country_id)
-> Nested Loop Left Join
(cost=31724.45..31726.73 rows=1 width=68)
-> Nested Loop
(cost=31724.02..31726.27 rows=1 width=30)
-> GroupAggregate
(cost=31723.60..31723.62 rows=1 width=26)
Group Key:
c_2.gii_circuitid
-> Sort
(cost=31723.60..31723.60 rows=1 width=26)
Sort Key:
c_2.gii_circuitid
-> Gather
(cost=1000.85..31723.59 rows=1 width=26)
Workers
Planned: 3
-> Nested
Loop (cost=0.85..30723.49 rows=1 width=26)
->
Nested Loop (cost=0.42..30722.56 rows=2 width=30)

-> Parallel Seq Scan on circuit c_2 (cost=0.00..30714.61 rows=3
width=26)

Filter: ((status >= 20) AND (status <> ALL ('{160,999}'::int
eger[])) AND (status <> 160) AND (service_description = 28))

-> Index Scan using so_pid_idx on service_order so_1 (cost=0.42.
.2.65 rows=1 width=12)

Index Cond: (product_id = c_2.product_id)

Filter: (((ordertype_id <> 2) OR (status = 999)) AND ((order
type_id <> 3) OR (status = 999)))
->
Index Scan using master_service_order_id_key on master_service_order
mso (cost=0.42..0.46 rows=1 width=4)

Index Cond: (id = so_1.master_service_order_id)

Filter: (client_id = 11615)
-> Index Scan using
service_order_id_key on service_order so (cost=0.42..2.64 rows=1 width=12)
Index Cond: (id = (max(
so_1.id)))
-> Index Scan using
client_site_pkey on client_site cs (cost=0.42..0.46 rows=1 width=46)
Index Cond: (id = so.a_site_id)
-> Seq Scan on country csc
(cost=0.00..4.50 rows=250 width=6)
-> Index Scan using circuit_product_id_idx on
circuit c (cost=0.42..0.49 rows=1 width=12)
Index Cond: (product_id = so.product_id)
-> Seq Scan on circuit_status cst (cost=0.00..1.20
rows=20 width=19)
-> Gather (cost=1466.02..221787.23 rows=3 width=75)
Workers Planned: 5
-> Hash Join (cost=466.02..220786.93 rows=1 width=75)
Hash Cond: (c_1.id = vendor_gtt_pop.gtt_pop_id)
-> Hash Join (cost=444.07..219779.19
rows=157724 width=63)
Hash Cond: (c_1.pop_support_vendor_id =
v.id)
-> Merge Join (cost=5.02..217348.87
rows=157724 width=40)
Merge Cond: (cl.circuit_id = c_1.id)
-> Nested Loop
(cost=0.86..171314.49 rows=157724 width=32)
-> Parallel Index Only Scan
using circuit_layout_idx on circuit_layout cl (cost=0.43..55430.93
rows=157724 width=8)
Index Cond: (ordinal = 1)
-> Index Scan using uniqid on
segment sg (cost=0.43..0.73 rows=1 width=32)
Index Cond: (id =
cl.segment_id)
-> Index Scan using circuit_id_key
on circuit c_1 (cost=0.42..41790.58 rows=909014 width=8)
-> Hash (cost=325.69..325.69 rows=9069
width=27)
-> Seq Scan on vendor v
(cost=0.00..325.69 rows=9069 width=27)
-> Hash (cost=21.91..21.91 rows=3 width=12)
-> Seq Scan on vendor_gtt_pop
(cost=0.00..21.91 rows=3 width=12)
Filter: (vendor_id = 12346)
(55 rows)

On Wed, Mar 21, 2018 at 8:01 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

>
>
> On 03/21/2018 05:09 PM, Alessandro Aste wrote:
> > Hi there, we are using postgresql 10.3 and we're facing an issue with a
> > query. The query (full query below) completes only when:
> >
> > 1 - LIMIT 10 is removed
> > or
> > 2 - show max_parallel_workers_per_gather is set to 0, so parallel
> > processing is disabled.
> >
> > With max_parallel_workers_per_gather set to the default value (8) I'm
> > not even able to get the query plan.
> >
> > Notes:
> >
> > * We're experiencing the issue in any server of ours but I've
> > reproduced the issue in a fresh restored database with full
> > vacuum/reindex of the tables.
> > * We didn't touch any parameter concering the parallel processing,
> > we're running the defaults:
> >
> >
> > cmdstaging=# show max_parallel_workers_per_gather ;
> > max_parallel_workers_per_gather
> > ---------------------------------
> > 8
> > (1 row)
> >
> > cmdstaging=# show max_worker_processes ;
> > max_worker_processes
> > ----------------------
> > 8
> > (1 row)
> >
> > cmdstaging=# show max_parallel_workers;
> > max_parallel_workers
> > ----------------------
> > 8
> > (1 row)
> >
> >
> >
> >
> > The query completes only omitting the LIMIT clause or when I disable
> > parallel processing:
> >
> > id | vendor_id | gaa | pop_name | pop_status | pop_location |
> > pop_provider_id | pop_provider | pop_street | pop_city | pop
> > _postal_code | pop_state | pop_country | pop_country_id
> > --------+-----------+-----+---------------+------------+----
> ----------+-----------------+--------------+----------------
> -----+----------+----
> > -------------+-----------+-------------+----------------
> > 684807 | 12346 | | GTT/POP/LON1T | Active | LON1T |
> > 12288 | Telehouse UK | 14 Coriander Avenue | London | E14
> > 2AA | | GB | 219
> > (1 row)
> >
> > Time: 4374.759 ms (00:04.375)
> > cmdstaging=# show max_parallel_workers_per_gather ;
> > max_parallel_workers_per_gather
> > ---------------------------------
> > 0
> > (1 row)
> >
> > Time: 0.097 ms
> >
> >
> > Otherwise it just keep running for forever.
> >
>
> When you say "running forever" is it actually using CPU, or does it get
> stuck on something?
>
> >
> > This is the full query:
> >
> >
> > SELECT * FROM (
> > SELECT
> > seg.circuit_id AS id,
> > vendor_gtt_pop.vendor_id,
> > CASE WHEN vendor_gtt_pop.push_to_gaa = 1 THEN 'Y' END as gaa,
> > pop.gii_circuitid AS pop_name,
> > cst.label AS pop_status,
> > seg.a_company_name AS pop_location,
> > seg.vendor_id AS pop_provider_id,
> > seg.vendor_name AS pop_provider,
> > cs.address1 AS pop_street,
> > cs.city AS pop_city,
> > cs.postal_code AS pop_postal_code,
> > cs.state AS pop_state,
> > csc.code AS pop_country,
> > cs.country_id AS pop_country_id
> > FROM (
> > SELECT c.gii_circuitid, max(so.id <http://so.id>) AS service_order_id
> > FROM service_order so
> > join circuit c on c.product_id=so.product_id
> > join master_service_order mso on mso.id <http://mso.id>=so.master_serv
> ice_order_id
> > WHERE NOT (so.ordertype_id = 2 AND so.status <> 999) AND
> > NOT (so.ordertype_id = 3 AND so.status <> 999) AND
> > c.status >= 20 AND
> > c.status not in (160,999) AND
> > mso.client_id=11615 AND
> > c.service_description=28 AND
> > c.status!=160
> > GROUP BY c.gii_circuitid
> > ) pop
> > JOIN service_order so ON so.id <http://so.id> = pop.service_order_id
> > left JOIN client_site cs on cs.id <http://cs.id>=so.a_site_id
> > left JOIN country csc on csc.id <http://csc.id>=cs.country_id
> > JOIN circuit c ON so.product_id=c.product_id
> > JOIN circuit_status cst ON cst.id <http://cst.id>=c.status
> > JOIN (
> > SELECT c.id <http://c.id> AS circuit_id, sg.id <http://sg.id> AS
> segment_id, c.pop_support_vendor_id AS vendor_id,
> > v.name <http://v.name> AS vendor_name, sg.a_company_name
> > FROM segment sg
> > JOIN circuit_layout cl ON cl.segment_id = sg.id <http://sg.id> AND
> cl.ordinal = 1
> > JOIN circuit c ON c.id <http://c.id> = cl.circuit_id
> > JOIN vendor v ON v.id <http://v.id> = c.pop_support_vendor_id
> > ) seg ON seg.circuit_id = c.id <http://c.id>
> > JOIN vendor_gtt_pop on vendor_gtt_pop.gtt_pop_id = seg.circuit_id
> > ) foo where vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id LIMIT 10
> >
> >
> >
> > Execution plan with max_parallel_workers_per_gather =0 ,
> > max_parallel_workers_per_gather =8 and no LIMIT clause :
> >
> >
> We really need to see the execution plan that causes issues, i.e.
> max_parallel_workers_per_gather=8 with LIMIT clause. Plain explain
> (without analyze), at least.
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message chris 2018-03-21 20:16:02 Re: JDBC connectivity issue
Previous Message Tomas Vondra 2018-03-21 19:01:26 Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on