[PATCH] Fix jsonb comparison for raw scalar pseudo arrays

From: Yan Chengpeng <chengpeng_yan(at)outlook(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] Fix jsonb comparison for raw scalar pseudo arrays
Date: 2024-11-18 14:25:31
Message-ID: OSBPR01MB45199DD8DA2D1CECD50518188E272@OSBPR01MB4519.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear PostgreSQL Hackers,

Problem Description

I encountered an issue with the B-Tree ordering of `jsonb` values. According to the PostgreSQL documentation[1], the ordering should follow this precedence:

`Object > Array > Boolean > Number > String > Null`

However, empty arrays (`[]`) are currently considered smaller than `null`, which violates the documented rules. This occurs due to improper handling of the `rawScalar` flag when comparing arrays in the `compareJsonbContainers()` function in `src/backend/utils/adt/jsonb_util.c`.

Example to Reproduce the Issue

```sql
postgres=# -- Create a test table with a jsonb column
CREATE TABLE jsonb_test (j jsonb PRIMARY KEY);

-- Insert various jsonb values
INSERT INTO jsonb_test VALUES ('null');
INSERT INTO jsonb_test VALUES ('true');
INSERT INTO jsonb_test VALUES ('false');
INSERT INTO jsonb_test VALUES ('0');
INSERT INTO jsonb_test VALUES ('1');
INSERT INTO jsonb_test VALUES ('"string"');
INSERT INTO jsonb_test VALUES ('[]');
INSERT INTO jsonb_test VALUES ('[1, 2, 3]');
INSERT INTO jsonb_test VALUES ('{}');
INSERT INTO jsonb_test VALUES ('{"a": 1}');

-- Query the table to check ordering
SELECT * FROM jsonb_test ORDER BY j;
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
j
-----------
[]
null
"string"
0
1
false
true
[1, 2, 3]
{}
{"a": 1}
(10 rows)
```
The empty array ([]) is incorrectly placed before null.

Analysis

The issue stems from how the rawScalar flag is evaluated in the compareJsonbContainers() function. When comparing arrays, the function does not prioritize the rawScalar flag before comparing the number of elements (nElems), leading to incorrect ordering for arrays treated as “raw scalars.”

Proposed Fix

The proposed fix ensures the rawScalar flag is checked first, and only when both values have the same flag, the number of elements is compared. This guarantees correct ordering of arrays and scalar values. The details are in the attached patch.

Testing

I have added new test cases to validate the B-Tree ordering of various `jsonb` values, including:
1. Scalars (null, true, false, numbers, strings).
2. Arrays (empty, single-element, nested).
3. Objects (empty, single-key, nested).
4. Mixed types (arrays containing objects, scalars, etc.).

The test cases are included in jsonb.sql and the corresponding expected output file.

I have run make check, and all tests pass successfully.

I would appreciate feedback on the proposed solution or suggestions for improvement.

References

[1] JSON Types - https://www.postgresql.org/docs/current/datatype-json.html

Best regards,
Chengpeng(Jack) Yan

Attachment Content-Type Size
v1-0001-fix-jsonb-compare.patch application/octet-stream 5.5 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-11-18 15:06:36 Re: FW: Building Postgres 17.0 with meson
Previous Message Peter Eisentraut 2024-11-18 14:23:27 Re: NOT ENFORCED constraint feature