Quick Links

Re: Pull up aggregate subquery

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Pull up aggregate subquery
Date:	2011-05-24 17:49:18
Message-ID:	BANLkTimURp+3_LZuaOyP0GSMOeZ7qNNdbw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, May 24, 2011 at 12:34 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Oh, I see. I have a general gripe with nested loop plans: we already
>> consider too many of them. IIRC, when I last fooled around with this,
>> the number of nested loop paths that we generate far exceeded the
>> number of merge or hash join paths, and most of those paths suck and
>> are a complete waste of time.
>
> Hm, really? My experience is that it's the mergejoin paths that breed
> like rabbits, because there are so many potential sort orders.

*scratches head*

Well, I'm pretty sure that's how it looked when I was testing it. I
wonder how this could be different for the two of us. Or maybe one of
us is confused. Admittedly, I haven't looked at it in a while.

>>> But I think this is all fairly unrelated to the case that Hitoshi is on
>>> about. As you said earlier, it seems like we'd have to derive both
>>> parameterized and unparameterized plans for the subquery, which seems
>>> mighty expensive.
>
>> That was my first thought, too, but then I wondered if I was getting
>> cheap.
>
> Yeah, it's certainly possible that we're worrying too much. Usually
> I only get concerned about added planner logic if it will impact the
> planning time for simple queries. "Simple" tends to be in the eye of
> the beholder, but something with a complicated aggregate subquery is
> probably not simple by anyone's definition.
>
> In this case the sticky point is that there could be multiple possible
> sets of clauses available to be pushed down, depending on what you
> assume is the outer relation for the eventual upper-level nestloop.
> So worst case, you could have not just one parameterized plan to
> generate in addition to the regular kind, but 2^N of them ...

Hmm. Well, 2^N is more than 2. But I bet most of them are boring.
Judging by his followup email, Hitoshi Harada seems to think we can
just look at the case where we can parameterize on all of the grouping
columns. The only other case that seems like it might be interesting
is parameterizing on any single one of the grouping columns. I can't
get excited about pushing down arbitrary subsets. Of course, even
O(N) in the number of grouping columns might be too much, but then we
could fall back to just all or nothing. I think the "all" case by
itself would probably extract 90%+ of the benefit, especially since
"all" will often mean "the only one there is".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Pull up aggregate subquery at 2011-05-24 16:34:05 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andrew Dunstan	2011-05-24 17:56:26	Re: "errno" not set in case of "libm" functions (HPUX)
Previous Message	Josh Berkus	2011-05-24 17:48:01	Re: 9.2 schedule