RE: "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1

From: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>
To: nikolai(dot)berkoff <nikolai(dot)berkoff(at)pm(dot)me>, "pgsql-docs(at)lists(dot)postgresql(dot)org" <pgsql-docs(at)lists(dot)postgresql(dot)org>
Subject: RE: "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1
Date: 2021-11-17 05:32:09
Message-ID: OS0PR01MB6113A1D06C7D6BA427308CC9FB9A9@OS0PR01MB6113.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

Wednesday, November 17, 2021 6:29 AM, nikolai.berkoff <nikolai(dot)berkoff(at)pm(dot)me> wrote:
>
> Parallel query is explained in
> https://www.postgresql.org/docs/14/how-parallel-query-works.html and
> https://www.postgresql.org/docs/14/parallel-plans.html
>
> The docs seem clear to me that as the nodes are executed in parallel then the
> time execution time is not 140.036*3. The 140.036 value is actual time the Parallel
> Seq Scan nodes ran for but there were up to 2 running in parallel.

Thanks for your reply.
I read your references but still confused about the 'loops' in parallel query result.

> -> Parallel Seq Scan on c (cost=0.00..8591.67 rows=416667 width=0) (actual time=0.030..140.036 rows=333333 loops=3)
In my previous example, actual row number is 333333*3=1e6(which is correct), so I think the actual time is 140.036*3ms.
Do your think the loops(3) has no meaning for parallel scan node when calculate actual time?

Regards,
Tang

> -----Original Message-----
> From: nikolai.berkoff <nikolai(dot)berkoff(at)pm(dot)me>
> Sent: Wednesday, November 17, 2021 6:29 AM
> To: Tang, Haiying/唐 海英 <tanghy(dot)fnst(at)fujitsu(dot)com>;
> pgsql-docs(at)lists(dot)postgresql(dot)org
> Subject: Re: "actual time" in QUERY PLAN for parallel operation when loops is
> bigger than 1
>
> > https://www.postgresql.org/docs/14/using-explain.html
> > The doc says
> >
>
> > "the loops value reports the total number of executions of the node, and the
> >
>
> > actual time and rows values shown are averages per-execution. This is done
> >
>
> > to make the numbers comparable with the way that the cost estimates are
> >
>
> > shown."
> >
>
> > But I found for parallel operation, the above description maybe not
> >
>
> > correct.
> >
>
> > For example
> >
>
> > postgres=# create table c(id int);
> >
>
> > CREATE TABLE
> >
>
> > postgres=# insert into c select generate_series(1,1000000);
> >
>
> > INSERT 0 1000000
> >
>
> > postgres=# explain analyze select count(*) from c;
> >
>
> > QUERY PLAN
> >
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> ---------------------------------------------------------------------------
> -----------------------------------------------------------
> >
>
> > Finalize Aggregate (cost=10633.55..10633.56 rows=1 width=8) (actual
> >
>
> > time=290.460..290.508 rows=1 loops=1)
> >
>
> > -> Gather (cost=10633.33..10633.54 rows=2 width=8) (actual
> >
>
> > time=289.605..290.484 rows=3 loops=1)
> >
>
> > Workers Planned: 2
> >
>
> > Workers Launched: 2
> >
>
> > -> Partial Aggregate (cost=9633.33..9633.34 rows=1 width=8)
> >
>
> > (actual time=188.336..188.337 rows=1 loops=3)
> >
>
> > -> Parallel Seq Scan on c (cost=0.00..8591.67 rows=416667
> >
>
> > width=0) (actual time=0.030..140.036 rows=333333 loops=3)
> >
>
> > Planning Time: 0.331 ms
> >
>
> > Execution Time: 290.607 ms
> >
>
> > (8 rows)
> >
>
> > postgres=#
> >
>
> > According to PG-doc, the "Parallel Seq Scan" node cost 140.036*3=420ms, but
> >
>
> > the total cost for this SQL is only 290ms.
> >
>
> > Is the output of this explain correct?
>
> Parallel query is explained in
> https://www.postgresql.org/docs/14/how-parallel-query-works.html and
> https://www.postgresql.org/docs/14/parallel-plans.html
>
> The docs seem clear to me that as the nodes are executed in parallel then the
> time execution time is not 140.036*3. The 140.036 value is actual time the Parallel
> Seq Scan nodes ran for but there were up to 2 running in parallel.

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Guillaume Lelarge 2021-11-17 07:18:54 Re: "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1
Previous Message nikolai.berkoff 2021-11-16 21:29:10 Re: "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1