Re: Do we want a hashset type?

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "jian he" <jian(dot)universality(at)gmail(dot)com>
Cc: "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com>, "Tom Dunstan" <pgsql(at)tomd(dot)cc>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Do we want a hashset type?
Date: 2023-06-27 08:26:44
Message-ID: c94d97cb-825f-42a9-af8b-dff03fe3705f@app.fastmail.com
Lists: pgsql-hackers

On Tue, Jun 27, 2023, at 04:35, jian he wrote:
> in SQLMultiSets.pdf(previously thread) I found a related explanation
> on page 45, 46.
>
> (CASE WHEN OP1 IS NULL OR OP2 IS NULL THEN NULL ELSE MULTISET ( SELECT
> T1.V FROM UNNEST (OP1) AS T1 (V) INTERSECT SQ SELECT T2.V FROM UNNEST
> (OP2) AS T2 (V) ) END)
>
> CASE WHEN OP1 IS NULL OR OP2 IS NULL THEN NULL ELSE MULTISET ( SELECT
> T1.V FROM UNNEST (OP1) AS T1 (V) UNION SQ SELECT T2.V FROM UNNEST
> (OP2) AS T2 (V) ) END
>
> (CASE WHEN OP1 IS NULL OR OP2 IS NULL THEN NULL ELSE MULTISET ( SELECT
> T1.V FROM UNNEST (OP1) AS T1 (V) EXCEPT SQ SELECT T2.V FROM UNNEST
> (OP2) AS T2 (V) ) END)

Thanks! This was exactly what I was looking for. I knew I had seen it but failed to find it again.

Attached is a new incremental patch as well as a full patch, since this is a substantial change:

Align null semantics with SQL:2023 array and multiset standards

* Introduce a new boolean field, null_element, in the int4hashset_t type.

* Rename hashset_count() to hashset_cardinality().

* Rename hashset_merge() to hashset_union().

* Rename hashset_equals() to hashset_eq().

* Rename hashset_neq() to hashset_ne().

* Add hashset_to_sorted_array().

* Handle null semantics to work as in arrays and multisets.

* Update int4hashset_add() to allow creating a new set if none exists.

* Use more portable int32 typedef instead of int32_t.

This also adds a thorough test suite in array-and-multiset-semantics.sql,
which aims to test all relevant combinations of operations and values.
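
To illustrate the null semantics described above, here is a minimal toy sketch in C. It is not code from the patch: the toyset type, its 0..63 value range, and all toyset_* functions are hypothetical names made up for this example, and only the null_element field name is taken from the change list. It assumes that a NULL set can be modelled as a NULL pointer, that a union with a NULL operand is NULL (as in the quoted CASE expression), and that a NULL element in either operand shows up once in the result, as UNION does with nulls.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy set over the values 0..63; not the patch's int4hashset_t. */
typedef struct toyset
{
    uint64_t bits;          /* bit v is set when value v is a member */
    bool     null_element;  /* true when the set also contains a NULL element */
} toyset;

/* Add a non-null value in the range 0..63; returns false if out of range. */
static bool
toyset_add(toyset *set, int32_t value)
{
    if (value < 0 || value > 63)
        return false;
    set->bits |= UINT64_C(1) << value;
    return true;
}

/* Record that the set contains a NULL element. */
static void
toyset_add_null(toyset *set)
{
    set->null_element = true;
}

/* Test membership of a non-null value; out-of-range values are never members. */
static bool
toyset_contains(const toyset *set, int32_t value)
{
    if (value < 0 || value > 63)
        return false;
    return ((set->bits >> value) & 1) != 0;
}

/*
 * Union in the spirit of the quoted CASE expression: if either operand is
 * NULL (a NULL pointer here), the result is NULL. Otherwise the members are
 * merged, and the result has a NULL element if either operand has one.
 * The caller provides storage for the result.
 */
static toyset *
toyset_union(const toyset *a, const toyset *b, toyset *result)
{
    if (a == NULL || b == NULL)
        return NULL;
    result->bits = a->bits | b->bits;
    result->null_element = a->null_element || b->null_element;
    return result;
}
```

The point of the sketch is only the distinction between a NULL set (no set at all) and a set that contains a NULL element, which is what a separate boolean flag makes representable.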

Makefile | 2 +-
README.md | 6 ++--
hashset--0.0.1.sql | 37 +++++++++++---------
hashset-api.c | 208 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------
hashset.c | 12 ++++++-
hashset.h | 11 +++---
test/expected/array-and-multiset-semantics.out | 365 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
test/expected/basic.out | 12 +++----
test/expected/reported_bugs.out | 6 ++--
test/expected/strict.out | 114 ------------------------------------------------------------
test/expected/table.out | 8 ++---
test/sql/array-and-multiset-semantics.sql | 232 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
test/sql/basic.sql | 4 +--
test/sql/benchmark.sql | 14 ++++----
test/sql/reported_bugs.sql | 6 ++--
test/sql/strict.sql | 32 -----------------
test/sql/table.sql | 2 +-
17 files changed, 823 insertions(+), 248 deletions(-)

/Joel

Attachment                               Content-Type              Size
hashset-0.0.1-b7e5614-full.patch         application/octet-stream  125.5 KB
hashset-0.0.1-b7e5614-incremental.patch  application/octet-stream  65.6 KB
