Re: queries with DISTINCT / GROUP BY giving different plans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: queries with DISTINCT / GROUP BY giving different plans
Date: 2013-08-14 18:35:14
Message-ID: 11586.1376505314@sss.pgh.pa.us
Lists: pgsql-performance

"Tomas Vondra" <tv(at)fuzzy(dot)cz> writes:
> I've run into a strange plan difference on 9.1.9 - the first query does
> "DISTINCT" by doing a GROUP BY on the columns (both INT). ...
> Now, this takes ~45 seconds to execute, but after rewriting the query to
> use the regular DISTINCT it suddenly switches to HashAggregate with ~1/3
> the cost (although it produces the same output, AFAIK), and it executes in
> ~15 seconds.

[ scratches head... ] I guess you're running into some corner case where
choose_hashed_grouping and choose_hashed_distinct make different choices.
It's going to be tough to debug without a test case though. I couldn't
reproduce the behavior in a few tries here.

> BTW I can't test this on 9.2 or 9.3 easily, as this is our production
> environment and I can't just export the data. I've tried to simulate this
> but so far no luck.

I suppose they won't let you step through those two functions with a
debugger either ...

regards, tom lane
