From: | Sam Mason <sam(at)samason(dot)me(dot)uk> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | 8.3 planner handling of IS NULL in aggregations |
Date: | 2008-07-04 14:37:12 |
Message-ID: | 20080704143712.GY2572@frubble.xen.chris-lamb.co.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi,
I've just noticed that the planner in 8.3.3 doesn't seem to realize the
difference in the result of the following:
SELECT col, COUNT(*)
FROM tbl
GROUP BY col;
and
SELECT col IS NULL, COUNT(*)
FROM tbl
GROUP BY col IS NULL;
For a table with several million distinct values in "col" this
makes quite a difference. I'd expect to be getting in memory hash
aggregations, but I'm getting a sort step in there instead. Here's an
example:
SELECT col1 IS NOT NULL, col2 IS NOT NULL, col3 IS NOT NULL,
COUNT(*)
FROM tbl
GROUP BY 1,2,3
ORDER BY 1,2,3;
gives the following plan:
GroupAggregate (cost=5018623.99..5387423.18 rows=4338999 width=12)
-> Sort (cost=5018623.99..5081536.33 rows=25164936 width=12)
Sort Key: ((col1 IS NOT NULL)), ((col2 IS NOT NULL)), ((col3 IS NOT NULL))
-> Seq Scan on tbl (cost=0.00..376989.36 rows=25164936 width=12)
I can't see any way for it to produce more than 8 rows of output and so
I'd expect a hash aggregate to be best. Removing the IS NOT NULLs from
the expression gives basically the same plan and expected number of rows
which then looks reasonable.
Sam
From | Date | Subject | |
---|---|---|---|
Next Message | Raymond O'Donnell | 2008-07-04 14:39:35 | Re: simple installation problem in windows system |
Previous Message | Raymond O'Donnell | 2008-07-04 14:36:42 | Re: simple installation problem in windows system |