BUG #18126: Ordered set aggregate: result does not depend on sort order

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: wollhofen(at)unisoftwareplus(dot)com
Subject: BUG #18126: Ordered set aggregate: result does not depend on sort order
Date: 2023-09-21 10:57:26
Message-ID: 18126-b4b5de7206c972e3@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18126
Logged by: Richard Wollhofen
Email address: wollhofen(at)unisoftwareplus(dot)com
PostgreSQL version: 16.0
Operating system: Debian 16.0-1.pgdg120+1
Description:

Probably related to bug report #18125.
Using a custom ordered set aggregate, the sort order in the ORDER BY after
the WITHIN GROUP does not show an influence on the results, as demonstrated
in the script below. There are three possible ways to achieve the same
ascending or descending sort order in the test data, respectively. However,
the result depends on WHAT is used for sorting (columns, whole row) instead
of the actual ordered result set.
Tested for postgres major versions 10 - 16.

Link to test script
https://dbfiddle.uk/2G7NgXPn

-- custom md5 aggregate sfunc
CREATE FUNCTION md5_agg_sfunc(text, anyelement)
RETURNS text
LANGUAGE sql
AS
$$
SELECT md5($1 || $2::text)
$$;

-- custom md5 aggregate function
CREATE AGGREGATE md5_agg (ORDER BY anyelement)
(
STYPE = text,
SFUNC = md5_agg_sfunc,
INITCOND = ''
);

-- test data, two correlated columns
CREATE TABLE t_test (a int, b int);
INSERT INTO t_test
SELECT x, x + 10
FROM generate_series(1, 100000) AS x;

-- Result for ASC sort order is the same for all three criteria!
SELECT * FROM t_test ORDER BY t_test ASC LIMIT 3;
SELECT * FROM t_test ORDER BY t_test.a ASC LIMIT 3;
SELECT * FROM t_test ORDER BY t_test.b ASC LIMIT 3;

-- Result for DESC sort order is the same for all three criteria!
SELECT * FROM t_test ORDER BY t_test DESC LIMIT 3;
SELECT * FROM t_test ORDER BY t_test.a DESC LIMIT 3;
SELECT * FROM t_test ORDER BY t_test.b DESC LIMIT 3;

-- expected 2 hashes, hash H1 for columns 1-3 for ASC order,
-- hash H2 for columns 4-6 for DESC order
SELECT md5_agg() WITHIN GROUP (ORDER BY t_test ASC),
md5_agg() WITHIN GROUP (ORDER BY t_test.a ASC),
md5_agg() WITHIN GROUP (ORDER BY t_test.b ASC),
md5_agg() WITHIN GROUP (ORDER BY t_test DESC),
md5_agg() WITHIN GROUP (ORDER BY t_test.a DESC),
md5_agg() WITHIN GROUP (ORDER BY t_test.b DESC)
FROM t_test;
-- received 3 hashes:
-- H3 for col. 1, 4
-- H4 for col. 2, 5
-- H5 for col. 3, 6
-- Sort order has no influence.
-- Specification of order criterion (whole record or different columns)
determines the result.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2023-09-21 11:00:01 BUG #18127: Assertion HaveRegisteredOrActiveSnapshot failed on REINDEX CONCURRENTLY when blocksize=1
Previous Message PG Bug reporting form 2023-09-21 10:30:44 BUG #18125: Ordered set aggregate results (MD5 hashing) vary between postgres versions