8.3 planner handling of IS NULL in aggregations

From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: 8.3 planner handling of IS NULL in aggregations
Date: 2008-07-04 14:37:12
Message-ID: 20080704143712.GY2572@frubble.xen.chris-lamb.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I've just noticed that the planner in 8.3.3 doesn't seem to realize the
difference in the result of the following:

SELECT col, COUNT(*)
FROM tbl
GROUP BY col;

and

SELECT col IS NULL, COUNT(*)
FROM tbl
GROUP BY col IS NULL;

For a table with several million distinct values in "col" this
makes quite a difference. I'd expect to be getting in memory hash
aggregations, but I'm getting a sort step in there instead. Here's an
example:

SELECT col1 IS NOT NULL, col2 IS NOT NULL, col3 IS NOT NULL,
COUNT(*)
FROM tbl
GROUP BY 1,2,3
ORDER BY 1,2,3;

gives the following plan:

GroupAggregate (cost=5018623.99..5387423.18 rows=4338999 width=12)
-> Sort (cost=5018623.99..5081536.33 rows=25164936 width=12)
Sort Key: ((col1 IS NOT NULL)), ((col2 IS NOT NULL)), ((col3 IS NOT NULL))
-> Seq Scan on tbl (cost=0.00..376989.36 rows=25164936 width=12)

I can't see any way for it to produce more than 8 rows of output and so
I'd expect a hash aggregate to be best. Removing the IS NOT NULLs from
the expression gives basically the same plan and expected number of rows
which then looks reasonable.

Sam

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Raymond O'Donnell 2008-07-04 14:39:35 Re: simple installation problem in windows system
Previous Message Raymond O'Donnell 2008-07-04 14:36:42 Re: simple installation problem in windows system