Re: DATE_TRUNC() and GROUP BY?

From: Dave Johansen <davejohansen(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: DATE_TRUNC() and GROUP BY?
Date: 2014-01-02 19:36:34
Message-ID: CAAcYxUfU+zJ=qgUKXa8Ron+Tsdo9a7noUreX4_mpm3JCwBpTWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Jan 2, 2014 at 11:39 AM, Dave Johansen <davejohansen(at)gmail(dot)com>wrote:

> On Fri, Dec 20, 2013 at 10:46 PM, David Rowley <dgrowleyml(at)gmail(dot)com>wrote:
>
>> On Fri, Dec 20, 2013 at 1:35 PM, Dave Johansen <davejohansen(at)gmail(dot)com>wrote:
>>
>>> I just ran into an interesting issue on Postgres 8.4. I have a database
>>> with about 3 months of data and when I do following query:
>>> SELECT DATE_TRUNC('day', time) AS time_t, COUNT(*) FROM mytable GROUP BY
>>> time_t;
>>>
>>> EXPLAIN shows that it's doing a sort and then a GroupAggregate. There
>>> will only be ~90 outputs, so is there a way I can hint/force the planner to
>>> just do a HashAggregate?
>>>
>>> Just to see if it would change the plan, I tried increasing the work_mem
>>> up to 1GB and it still did the same plan.
>>>
>>>
>> PostgreSQL does not really have any stats on the selectivity of
>> date_trunc('day', time) so my guess is that it can only assume that it has
>> the same selectivity as the time column by itself... Which is very untrue
>> in this case.
>> The group aggregate plan is chosen here as PostgreSQL thinks the the hash
>> table is going to end up pretty big and decides that the group aggregate
>> will be the cheaper option.
>>
>> I mocked up your data and on 9.4 I can get the hash aggregate plan to run
>> if I set the n_distinct value to 90 then analyze the table again.. Even if
>> you could do this on 8.4 I'd not recommend it as it will probably cause
>> havoc with other plans around the time column. I did also get the hash
>> aggregate plan to run if I created a functional index on date_trunc('day',
>> time) then ran analyze again. I don't have a copy of 8.4 around to see if
>> the planner will make use of the index in the same way.
>>
>
I just tried this on 8.4 and it won't create the index because DATE_TRUNC()
is not immutable. The exact error is:
ERROR: function in index expression must be marked IMMUTABLE

Any suggestions or other ideas?

Thanks,
Dave

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Dave Johansen 2014-01-02 19:47:42 Re: DATE_TRUNC() and GROUP BY?
Previous Message Dave Johansen 2014-01-02 18:39:53 Re: DATE_TRUNC() and GROUP BY?