Re: Parallel Query

From: Luís Roberto Weck <luisroberto(at)siscobra(dot)com(dot)br>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Parallel Query
Date: 2019-11-13 21:07:16
Message-ID: d5a6a219-48c7-6da3-f96f-afac9cd38a1a@siscobra.com.br
Lists: pgsql-performance


On 13/11/2019 17:40, Jeff Janes wrote:
> On Wed, Nov 13, 2019 at 3:11 PM Luís Roberto Weck
> <luisroberto(at)siscobra(dot)com(dot)br> wrote:
>
> Hi!
>
> Is there a reason query 3 can't use parallel workers? Queries q1 and q2
> seem very similar to it, but they can use up to 4 workers to run faster:
>
> q1: https://pastebin.com/ufkbSmfB
> q2: https://pastebin.com/Yt32zRNX
> q3: https://pastebin.com/dqh7yKPb
>
> The sort node on q3 takes almost 12 seconds, making the query run in
> about 68 seconds, even if I had set enough work_mem to do the whole
> sort in memory.
>
>
> The third one thinks it is going to find 3454539 output rows.  If it
> ran in parallel, it thinks it would be passing lots of rows up from the
> parallel workers, and charges a high price (parallel_tuple_cost = 0.1)
> for doing so.  So you can try lowering parallel_tuple_cost, or
> figuring out why the estimate is so bad.
>
> Cheers,
>
> Jeff
Hi Jeff,

I don't think the "HAVING" clause is having any effect on the estimates:

WITHOUT "HAVING":
Group  (cost=1245134.08..1279680.28 rows=3454620 width=14)
  Group Key: c.concod, cp.conparnum, cp.conpardatven
  ->  Sort  (cost=1245134.08..1253770.63 rows=3454620 width=14)
        Sort Key: c.concod, cp.conparnum, cp.conpardatven
        ->  Hash Join  (cost=34366.64..869960.70 rows=3454620 width=14)
              Hash Cond: (cp.concod = c.concod)
              ->  Seq Scan on contrato_parcela cp  (cost=0.00..804248.08 rows=11941308 width=14)
              ->  Hash  (cost=23436.55..23436.55 rows=874407 width=8)
                    ->  Index Only Scan using contrato_iu0004 on contrato c  (cost=0.43..23436.55 rows=874407 width=8)
                          Index Cond: (carcod = 100)

WITH "HAVING":
GroupAggregate  (cost=1245144.88..1322874.51 rows=3454650 width=14)
  Group Key: c.concod, cp.conparnum, cp.conpardatven
  Filter: (count(*) > 1)
  ->  Sort  (cost=1245144.88..1253781.51 rows=3454650 width=14)
        Sort Key: c.concod, cp.conparnum, cp.conpardatven
        ->  Hash Join  (cost=34366.64..869968.02 rows=3454650 width=14)
              Hash Cond: (cp.concod = c.concod)
              ->  Seq Scan on contrato_parcela cp  (cost=0.00..804255.13 rows=11941413 width=14)
              ->  Hash  (cost=23436.55..23436.55 rows=874407 width=8)
                    ->  Index Only Scan using contrato_iu0004 on contrato c  (cost=0.43..23436.55 rows=874407 width=8)
                          Index Cond: (carcod = 100)

Maybe PostgreSQL has no way to estimate the selectivity of a HAVING filter?
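
For reference, this is roughly how I plan to test your suggestion in a
single session. The query text below is reconstructed from the plans
above (table, column and index names taken from them), so treat it as a
sketch of q3 rather than the exact statement:

SET parallel_tuple_cost = 0.01;  -- default is 0.1
SET work_mem = '1GB';            -- enough to keep the sort in memory

EXPLAIN (ANALYZE, BUFFERS)
SELECT c.concod, cp.conparnum, cp.conpardatven
  FROM contrato c
  JOIN contrato_parcela cp ON cp.concod = c.concod
 WHERE c.carcod = 100
 GROUP BY c.concod, cp.conparnum, cp.conpardatven
HAVING count(*) > 1;

RESET parallel_tuple_cost;
RESET work_mem;

If the bad row estimate is the real problem, lowering parallel_tuple_cost
only papers over it, but it should at least show whether the planner
switches to a parallel plan once passing tuples up from the workers is
cheaper.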
