Re: Parallel Query

From: Luís Roberto Weck <luisroberto(at)siscobra(dot)com(dot)br>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Parallel Query
Date: 2019-11-13 21:07:16
Message-ID: d5a6a219-48c7-6da3-f96f-afac9cd38a1a@siscobra.com.br
Lists: pgsql-performance


On 13/11/2019 17:40, Jeff Janes wrote:
> On Wed, Nov 13, 2019 at 3:11 PM Luís Roberto Weck
> <luisroberto(at)siscobra(dot)com(dot)br> wrote:
>
> Hi!
>
> Is there a reason query 3 can't use parallel workers? Queries q1 and q2
> seem very similar to it, but they can use up to 4 workers to run faster:
>
> q1: https://pastebin.com/ufkbSmfB
> q2: https://pastebin.com/Yt32zRNX
> q3: https://pastebin.com/dqh7yKPb
>
> The sort node on q3 takes almost 12 seconds, making the query run in
> about 68 seconds, even if I had set enough work_mem to do the whole
> sort in memory.
>
>
> The third one thinks it is going to find 3454539 output rows.  If it
> ran in parallel, it thinks it would be passing lots of rows up from the
> parallel workers, and charges a high price (parallel_tuple_cost = 0.1)
> for doing so.  So you can try lowering parallel_tuple_cost, or
> figuring out why the estimate is so bad.
>
> Cheers,
>
> Jeff
Hi Jeff,

I don't think the "HAVING" clause is having any effect on the estimates:

WITHOUT "HAVING":
Group  (cost=1245134.08..1279680.28 rows=3454620 width=14)
  Group Key: c.concod, cp.conparnum, cp.conpardatven
  ->  Sort  (cost=1245134.08..1253770.63 rows=3454620 width=14)
        Sort Key: c.concod, cp.conparnum, cp.conpardatven
        ->  Hash Join  (cost=34366.64..869960.70 rows=3454620 width=14)
              Hash Cond: (cp.concod = c.concod)
              ->  Seq Scan on contrato_parcela cp  (cost=0.00..804248.08 rows=11941308 width=14)
              ->  Hash  (cost=23436.55..23436.55 rows=874407 width=8)
                    ->  Index Only Scan using contrato_iu0004 on contrato c  (cost=0.43..23436.55 rows=874407 width=8)
                          Index Cond: (carcod = 100)

WITH "HAVING":
GroupAggregate  (cost=1245144.88..1322874.51 rows=3454650 width=14)
  Group Key: c.concod, cp.conparnum, cp.conpardatven
  Filter: (count(*) > 1)
  ->  Sort  (cost=1245144.88..1253781.51 rows=3454650 width=14)
        Sort Key: c.concod, cp.conparnum, cp.conpardatven
        ->  Hash Join  (cost=34366.64..869968.02 rows=3454650 width=14)
              Hash Cond: (cp.concod = c.concod)
              ->  Seq Scan on contrato_parcela cp  (cost=0.00..804255.13 rows=11941413 width=14)
              ->  Hash  (cost=23436.55..23436.55 rows=874407 width=8)
                    ->  Index Only Scan using contrato_iu0004 on contrato c  (cost=0.43..23436.55 rows=874407 width=8)
                          Index Cond: (carcod = 100)

Maybe PostgreSQL has no way to estimate the selectivity of a HAVING filter?
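
For reference, this is roughly how I plan to test your suggestion in a
single session. The query text below is reconstructed from the plans
above (table, column and index names taken from them), so treat it as a
sketch of q3 rather than the exact statement:

SET parallel_tuple_cost = 0.01;  -- default is 0.1
SET work_mem = '1GB';            -- enough to keep the sort in memory

EXPLAIN (ANALYZE, BUFFERS)
SELECT c.concod, cp.conparnum, cp.conpardatven
  FROM contrato c
  JOIN contrato_parcela cp ON cp.concod = c.concod
 WHERE c.carcod = 100
 GROUP BY c.concod, cp.conparnum, cp.conpardatven
HAVING count(*) > 1;

RESET parallel_tuple_cost;
RESET work_mem;

If the bad row estimate is the real problem, lowering parallel_tuple_cost
only papers over it, but it should at least show whether the planner
switches to a parallel plan once passing tuples up from the workers is
cheaper.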
