From: Luís Roberto Weck <luisroberto(at)siscobra(dot)com(dot)br>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Parallel Query
Date: 2019-11-13 21:04:42
Message-ID: 81c26220-3897-2d87-fb40-35050698f0a1@siscobra.com.br
Lists: pgsql-performance
On 13/11/2019 17:47, Tomas Vondra wrote:
> On Wed, Nov 13, 2019 at 05:16:44PM -0300, Luís Roberto Weck wrote:
>> Hi!
>>
>> Is there a reason query 3 can't use parallel workers? Queries q1 and q2
>> look very similar to it, yet they can use up to 4 workers to run faster:
>>
>> q1: https://pastebin.com/ufkbSmfB
>> q2: https://pastebin.com/Yt32zRNX
>> q3: https://pastebin.com/dqh7yKPb
>>
>> The sort node on q3 takes almost 12 seconds of the roughly 68 the whole
>> query takes, even if I set enough work_mem to do the sort in memory.
>>
>
> Most likely because it'd actually be slower. The trouble is the
> aggregation does not actually reduce the cardinality, or at least the
> planner does not expect that - the Sort and GroupAggregate are expected
> to produce 3454539 rows. The last step of the aggregation has to receive
> and merge data from all workers, which is not exactly free, and if there
> is no reduction of cardinality it's likely cheaper to just do everything
> in a single process serially.
>
> What does the EXPLAIN ANALYZE output look like without the HAVING clause?
>
> Try setting parallel_setup_cost and parallel_tuple_cost to 0. That might
> trigger parallel query.
>
> regards
>
Tomas,
EXPLAIN:

Group  (cost=1245130.37..1279676.46 rows=3454609 width=14)
  Group Key: c.concod, cp.conparnum, cp.conpardatven
  ->  Sort  (cost=1245130.37..1253766.89 rows=3454609 width=14)
        Sort Key: c.concod, cp.conparnum, cp.conpardatven
        ->  Hash Join  (cost=34366.64..869958.26 rows=3454609 width=14)
              Hash Cond: (cp.concod = c.concod)
              ->  Seq Scan on contrato_parcela cp  (cost=0.00..804245.73 rows=11941273 width=14)
              ->  Hash  (cost=23436.55..23436.55 rows=874407 width=8)
                    ->  Index Only Scan using contrato_iu0004 on contrato c  (cost=0.43..23436.55 rows=874407 width=8)
                          Index Cond: (carcod = 100)
EXPLAIN ANALYZE:

Group  (cost=1245132.29..1279678.44 rows=3454615 width=14) (actual time=61860.985..64852.579 rows=6787445 loops=1)
  Group Key: c.concod, cp.conparnum, cp.conpardatven
  ->  Sort  (cost=1245132.29..1253768.83 rows=3454615 width=14) (actual time=61860.980..63128.557 rows=6787531 loops=1)
        Sort Key: c.concod, cp.conparnum, cp.conpardatven
        Sort Method: external merge  Disk: 172688kB
        ->  Hash Join  (cost=34366.64..869959.48 rows=3454615 width=14) (actual time=876.428..52675.140 rows=6787531 loops=1)
              Hash Cond: (cp.concod = c.concod)
              ->  Seq Scan on contrato_parcela cp  (cost=0.00..804246.91 rows=11941291 width=14) (actual time=0.010..44860.242 rows=11962505 loops=1)
              ->  Hash  (cost=23436.55..23436.55 rows=874407 width=8) (actual time=874.791..874.791 rows=879841 loops=1)
                    Buckets: 1048576  Batches: 1  Memory Usage: 42561kB
                    ->  Index Only Scan using contrato_iu0004 on contrato c  (cost=0.43..23436.55 rows=874407 width=8) (actual time=0.036..535.897 rows=879841 loops=1)
                          Index Cond: (carcod = 100)
                          Heap Fetches: 144438
Planning time: 1.252 ms
Execution time: 65214.007 ms
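The sort spills to disk (external merge, ~169 MB), so the obvious first step
is raising work_mem for the session. A minimal sketch, assuming a value large
enough to hold the sort (256MB is a guess, not something I measured):

    -- session-level only; sized to fit the ~172688kB spill reported above
    SET work_mem = '256MB';
    -- then re-run the same EXPLAIN ANALYZE as above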
Indeed, reducing those costs made the query run in parallel, but the speed
improvement was not worth the extra CPU cost.
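For reference, this is roughly the session setup that produced the parallel
plan; the query itself is reconstructed here from the plan nodes above (Group
Key, Hash Cond, Index Cond), so it may not match my real query exactly:

    -- make parallel plans look free to the planner (session-level)
    SET parallel_setup_cost = 0;
    SET parallel_tuple_cost = 0;

    -- query reconstructed from the plan; names/clauses are assumptions
    EXPLAIN ANALYZE
    SELECT c.concod, cp.conparnum, cp.conpardatven
      FROM contrato c
      JOIN contrato_parcela cp ON cp.concod = c.concod
     WHERE c.carcod = 100
     GROUP BY c.concod, cp.conparnum, cp.conpardatven;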