From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: EXPLAIN ANALYZE for parallel query doesn't report the SortMethod information. |
Date: | 2016-07-07 14:35:02 |
Message-ID: | CA+Tgmobx=+QOOnGwP8p4c6jL0CQVLXsPq5pWvCwaZsGawNySXQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jul 7, 2016 at 10:07 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>> On Thu, Jul 7, 2016 at 1:23 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> I found $SUBJECT while trying to test parallel queries. Is this a bug?
>
> Presumably the instrumentation data needed for that is not getting
> returned from the worker to the leader.
Yes.
> I would bet there's a lot
> of other plan-node-specific data that doesn't work either.
That's probably true, too. Generally, what's going to happen here is
that if you have a true parallel query plan, any of this sort of
subsidiary information is going to reflect what the leader did, but
not what the workers did. If the leader did nothing, as in the case
of force_parallel_mode, then EXPLAIN ANALYZE will show the same thing
that it would have shown if that node had never executed.
>> I think this can never happen for force_parallel_mode TO off, because
>> we don't generate a gather on top of sort node. The reason why we are
>> able to push Sort below gather, because it is marked as parallel_safe
>> (create_sort_path). I think we should not mark it as parallel_safe.
>
> That seems rather ridiculous. An oversight in managing EXPLAIN data
> is not a sufficient reason to cripple parallel query.
+1.
Fixing this is actually somewhat difficult. The parallel query stuff
does handle propagating the common instrumentation information from
the workers to the leader, but the EXPLAIN ANALYZE output can depend
in arbitrary ways on the final executor state tree, which is, of
course, unshared, and which is also not something we can propagate
between backends since executor state nodes don't have (and can't
really support) serialization and deserialization functions. I think
we can eventually fix this by teaching individual nodes to store the
relevant information in dynamic shared memory rather than
backend-local memory when parallel query is in use: the
Estimate/InitializeDSM callbacks already give the nodes a chance to
obtain control in the right places, except that right now they're only
invoked for parallel-aware nodes. I think, though, that it will take
more development than we want to undertake at this point in the
cycle.
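
To make the shared-memory idea concrete, here is a rough sketch in Python rather than anything resembling the actual executor code. It assumes each worker gets a fixed-size slot in a shared segment, sized by an Estimate-style step and zeroed by an InitializeDSM-style step, so that the leader can read the slots back when producing EXPLAIN ANALYZE output. Every name here (SLOT, estimate_dsm, initialize_dsm, worker_report, leader_explain) and the slot layout are hypothetical, and a bytearray stands in for the dynamic shared memory segment.

```python
import struct

# Hypothetical per-worker slot layout: sort method id (int32), space used in kB (int64).
SLOT = struct.Struct("<iq")

# Hypothetical method ids; 0 means the worker never ran the sort.
SORT_METHODS = {1: "quicksort", 2: "top-N heapsort", 3: "external merge"}


def estimate_dsm(nworkers):
    """Estimate step: how many bytes this node needs in the shared segment."""
    return SLOT.size * nworkers


def initialize_dsm(segment, offset, nworkers):
    """Initialize step: zero every worker slot before the workers start."""
    for w in range(nworkers):
        SLOT.pack_into(segment, offset + w * SLOT.size, 0, 0)


def worker_report(segment, offset, worker, method, space_kb):
    """Worker side: write sort details into this worker's slot, not local memory."""
    SLOT.pack_into(segment, offset + worker * SLOT.size, method, space_kb)


def leader_explain(segment, offset, nworkers):
    """Leader side: read each slot and format one line per worker that sorted."""
    lines = []
    for w in range(nworkers):
        method, space_kb = SLOT.unpack_from(segment, offset + w * SLOT.size)
        if method != 0:
            lines.append(
                f"Worker {w}: Sort Method: {SORT_METHODS[method]}  Memory: {space_kb}kB"
            )
    return lines


# A bytearray stands in for the dynamic shared memory segment.
nworkers = 2
segment = bytearray(estimate_dsm(nworkers))
initialize_dsm(segment, 0, nworkers)
worker_report(segment, 0, worker=1, method=1, space_kb=25)
print(leader_explain(segment, 0, nworkers))
```

Fixed-size slots avoid any need to serialize executor state nodes: each node only decides which few numbers matter for its EXPLAIN output and copies those into its slot.
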
I'm not sure about the rest of you, but I'd kind of like to finish
this release and start working on the next one.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company