Re: Planning time in explain/explain analyze

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, Stephen Frost <sfrost(at)snowman(dot)net>, Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Planning time in explain/explain analyze
Date: 2014-01-28 14:03:51
Message-ID: CA+Tgmob3fWLT0VSS9jXXSy78SLWKnsgAqhhRUeB3F5x+_S8adA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 9, 2014 at 11:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Stark <stark(at)mit(dot)edu> writes:
>> On Thu, Jan 9, 2014 at 9:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> In short then, I think we should just add this to EXPLAIN and be done.
>>> -1 for sticking the info into PlannedStmt or anything like that.
>
>> I'm confused. I thought I was arguing to support your suggestion that
>> the initial planning store the time in the cached plan and explain
>> should output the time the original planning took.
>
> Uh, no, wasn't my suggestion. Doesn't that design imply measuring *every*
> planning cycle, explain or no? I was thinking more of just putting the
> timing calls into explain.c.

The problem is that you really can't do it that way.
ExplainOneQuery() is a good place to add the timing calls in general,
but ExplainExecuteQuery() in prepare.c never calls it; it does
GetCachedPlan(), which ultimately calls pg_plan_query() after about
four levels of indirection, and then passes the resulting plan
directly to ExplainOnePlan(). So if you only add timing calls to
explain.c, you can't handle anything that goes through
ExplainExecuteQuery(), which is confusingly in prepare.c rather than
explain.c.

One reasonably principled solution is to just give up on showing the
plan time for prepared queries altogether. If we don't want to adopt
that approach, then I think the right way to do this is to add a "bool
timing" argument to pg_plan_query(). When set, pg_plan_query()
measures the time before and after calling planner() and stores it in
the PlannedStmt. It could alternatively return it via an out
parameter, but that's not very convenient for prepare.c, which is
planning a *list* of queries and therefore presumably needs planning
time for each one.

I'm not bent on this design; I just don't see another way to do this
that has any conceptual integrity. Having the normal path measure the
time required to call pg_plan_query() and the prepared path measure
the time required to call GetCachedPlan() which may or may not
eventually call pg_plan_query() one or more times doesn't seem like a
good design, particularly if it's motivated solely by a desire to
minimize the code footprint of what's never going to be a very large
patch. The most recent version of the patch tries to finesse this
issue by determining whether GetCachedPlan() actually did anything; I
think the way it does that is likely an abstraction violation that we
don't want to countenance. But even if we're OK with that, it still
munges the planning times of multiple queries into a single number,
while the normal case separates them. It just seems like we're going
to a lot of trouble here to avoid the obvious design, and I'm not sure
why.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2014-01-28 14:12:39 Re: KNN-GiST with recheck
Previous Message salah jubeh 2014-01-28 14:01:39 Re: Add force option to dropdb