pgsql: Improve SELECT DISTINCT to consider hash aggregation, as well as

From: tgl(at)postgresql(dot)org (Tom Lane)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Improve SELECT DISTINCT to consider hash aggregation, as well as
Date: 2008-08-05 02:43:18
Message-ID: 20080805024318.7CFBD755315@cvs.postgresql.org
Lists: pgsql-committers

Log Message:
-----------
Improve SELECT DISTINCT to consider hash aggregation, as well as sort/uniq,
as methods for implementing the DISTINCT step. This eliminates the former
performance gap between DISTINCT and GROUP BY, and also makes it possible
to do SELECT DISTINCT on datatypes that only support hashing not sorting.
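As a rough illustration of the two strategies named above (this is not PostgreSQL's executor code, only a sketch in Python), duplicate elimination can be done either by sorting and dropping equal neighbours or by probing a hash table. The function names and sample data are hypothetical. Python's complex numbers stand in for a datatype that can be hashed but not ordered, and the hash variant here emits rows in first-seen order, which is a simplification: a real hash aggregate makes no such promise.

```python
from itertools import groupby


def distinct_sort_uniq(rows):
    """Sort the rows, then keep one value from each run of equal neighbours."""
    return [key for key, _ in groupby(sorted(rows))]


def distinct_hash(rows):
    """Keep one copy of each value, using a hash table (set) for lookups."""
    seen = set()
    out = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


rows = [3, 1, 3, 2, 1]
sorted_result = distinct_sort_uniq(rows)  # output comes back sorted
hashed_result = distinct_hash(rows)       # same values, order not sorted

# Complex numbers are hashable but have no ordering in Python, so only
# the hash-based method can deduplicate them.
unsortable = [1j, 2j, 1j]
hash_only_result = distinct_hash(unsortable)
try:
    distinct_sort_uniq(unsortable)
    sort_failed = False
except TypeError:
    sort_failed = True
```

Both methods return the same set of values for sortable input, but only the sort-based one yields them in sorted order as a side effect.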

SELECT DISTINCT ON is still always implemented by sorting; it would take
executor changes to support hashing that, and it's not clear it's worth
the trouble.

This is a release-note-worthy incompatibility from previous PG versions,
since SELECT DISTINCT can no longer be counted on to deliver sorted output
without explicitly saying ORDER BY. (Anyone who can't cope with that
can consider turning off enable_hashagg.)

Several regression test queries needed to have ORDER BY added to preserve
stable output order. I fixed the ones that manifested here, but there
might be some other cases that show up on other platforms.

Modified Files:
--------------
pgsql/src/backend/nodes:
outfuncs.c (r1.329 -> r1.330)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/nodes/outfuncs.c?r1=1.329&r2=1.330)
pgsql/src/backend/optimizer/plan:
planmain.c (r1.108 -> r1.109)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/plan/planmain.c?r1=1.108&r2=1.109)
planner.c (r1.237 -> r1.238)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/plan/planner.c?r1=1.237&r2=1.238)
pgsql/src/backend/parser:
parse_clause.c (r1.173 -> r1.174)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/parser/parse_clause.c?r1=1.173&r2=1.174)
pgsql/src/include/nodes:
relation.h (r1.156 -> r1.157)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/nodes/relation.h?r1=1.156&r2=1.157)
pgsql/src/test/regress/expected:
numerology.out (r1.4 -> r1.5)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/numerology.out?r1=1.4&r2=1.5)
opr_sanity.out (r1.82 -> r1.83)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/opr_sanity.out?r1=1.82&r2=1.83)
select_distinct.out (r1.5 -> r1.6)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/select_distinct.out?r1=1.5&r2=1.6)
pgsql/src/test/regress/input:
misc.source (r1.21 -> r1.22)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/input/misc.source?r1=1.21&r2=1.22)
pgsql/src/test/regress/output:
misc.source (r1.46 -> r1.47)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/output/misc.source?r1=1.46&r2=1.47)
pgsql/src/test/regress/sql:
numerology.sql (r1.4 -> r1.5)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/sql/numerology.sql?r1=1.4&r2=1.5)
opr_sanity.sql (r1.66 -> r1.67)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/sql/opr_sanity.sql?r1=1.66&r2=1.67)
select_distinct.sql (r1.5 -> r1.6)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/sql/select_distinct.sql?r1=1.5&r2=1.6)
