Re: "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1

From: "nikolai(dot)berkoff" <nikolai(dot)berkoff(at)pm(dot)me>
To: tanghy(dot)fnst(at)fujitsu(dot)com, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1
Date: 2021-11-16 21:29:10
Message-ID: l1nEq7fEqO4HORuUg3851cJpnSTnjxjMmqO8uqkXwqRGsEPguOEFojLPpcPKLXSHKMPy2plxOIPuJeTiYGRb-JAXEmKshKdEdbpcLjJydv4=@pm.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

> https://www.postgresql.org/docs/14/using-explain.html
> The doc says
>

> "the loops value reports the total number of executions of the node, and the
>

> actual time and rows values shown are averages per-execution. This is done
>

> to make the numbers comparable with the way that the cost estimates are
>

> shown."
>

> But I found for parallel operation, the above description maybe not
>

> correct.
>

> For example
>

> postgres=# create table c(id int);
>

> CREATE TABLE
>

> postgres=# insert into c select generate_series(1,1000000);
>

> INSERT 0 1000000
>

> postgres=# explain analyze select count(*) from c;
>

> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>

> Finalize Aggregate (cost=10633.55..10633.56 rows=1 width=8) (actual
>

> time=290.460..290.508 rows=1 loops=1)
>

> -> Gather (cost=10633.33..10633.54 rows=2 width=8) (actual
>

> time=289.605..290.484 rows=3 loops=1)
>

> Workers Planned: 2
>

> Workers Launched: 2
>

> -> Partial Aggregate (cost=9633.33..9633.34 rows=1 width=8)
>

> (actual time=188.336..188.337 rows=1 loops=3)
>

> -> Parallel Seq Scan on c (cost=0.00..8591.67 rows=416667
>

> width=0) (actual time=0.030..140.036 rows=333333 loops=3)
>

> Planning Time: 0.331 ms
>

> Execution Time: 290.607 ms
>

> (8 rows)
>

> postgres=#
>

> According to PG-doc, the "Parallel Seq Scan" node cost 140.036*3=420ms, but
>

> the total cost for this SQL is only 290ms.
>

> Is the output of this explain correct?

Parallel query is explained in https://www.postgresql.org/docs/14/how-parallel-query-works.html and https://www.postgresql.org/docs/14/parallel-plans.html

The docs seem clear to me that as the nodes are executed in parallel then the time execution time is not 140.036*3. The 140.036 value is actual time the Parallel Seq Scan nodes ran for but there were up to 2 running in parallel.

Attachment Content-Type Size
publickey - nikolai.berkoff@pm.me - 0xD189792D.asc application/pgp-keys 1.8 KB

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message tanghy.fnst@fujitsu.com 2021-11-17 05:32:09 RE: "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1
Previous Message PG Doc comments form 2021-11-16 09:37:21 "actual time" in QUERY PLAN for parallel operation when loops is bigger than 1