Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hubert Lubaczewski <depesz(at)depesz(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com
Date: 2017-11-29 08:36:57
Message-ID: CAA4eK1JhEd_oBZZHzDAueKvzg5PwAmYes57+Z+nRy9JNu0Y1wg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 28, 2017 at 9:37 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Nov 27, 2017 at 6:54 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> Anything "below" "Gather"?
>>>
>> I think it is "actual_time * 1" for anything below Gather.
>
> The actual time amounts below Gather show total elapsed time divided
> by loop count, just as they do anywhere else in the plan. The loop
> count tracks the number of times the plan was executed, just as it
> does anywhere else in the plan. So for example if there are 5
> participants which each execute the node once, and the times spent are
> 5 s, 60 s, 60 s, 60 s, and 60 s, you'll get actual time = 49 s, loops
> = 5.
>
> If you want to know the total amount of time spent under the node,
> then you have to multiply the actual time (49 s in this example) by
> the loop count (5 in this example) just as you would for any other
> plan node. However, you have to keep in mind that, for a parallel
> query, the total time spent under the node is not the same as the
> elapsed time. In this example, if all 5 workers started at the same
> time and ran the node continuously without a break, the *elapsed* time
> until they all finished would be 60 s, not 49 s, a value that EXPLAIN
> will report nowhere. But they can also start and stop executing under
> that node repeatedly and they need not all start at the same time,
> making the concept of elapsed time a bit unclear -- earliest start to
> latest finish would be one way, but that will be misleading (perhaps)
> if they spend 5 seconds a piece but start staggered at 4 second
> intervals.
>
> It's really hard to understand what's actually going on with a
> parallel query unless you look at each worker individually, and even
> then sometimes it's not as clear as it could be.
>

I think stats at Gather node itself gives a somewhat right picture.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2017-11-29 08:40:29 Re: PATCH: pgbench - option to build using ppoll() for larger connection counts
Previous Message Amit Kapila 2017-11-29 08:34:44 Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com