From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Yan Chengpeng <chengpeng_yan(at)outlook(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] Fix jsonb comparison for raw scalar pseudo arrays |
Date: | 2024-11-29 08:37:55 |
Message-ID: | CACJufxGehME464jfd2_hTQ8_ZYS5jFD8nxfh30tPQHryPBz7QQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 18, 2024 at 10:25 PM Yan Chengpeng
<chengpeng_yan(at)outlook(dot)com> wrote:
>
>
> I encountered an issue with the B-Tree ordering of `jsonb` values. According to the PostgreSQL documentation[1], the ordering should follow this precedence:
>
> `Object > Array > Boolean > Number > String > Null`
>
>
> However, empty arrays (`[]`) are currently considered smaller than `null`, which violates the documented rules. This occurs due to improper handling of the `rawScalar` flag when comparing arrays in the `compareJsonbContainers()` function in `src/backend/utils/adt/jsonb_util.c`.
>
> ```
>
> The empty array ([]) is incorrectly placed before null.
>
>
> Analysis
>
>
> The issue stems from how the rawScalar flag is evaluated in the compareJsonbContainers() function.
> When comparing arrays, the function does not prioritize the rawScalar flag before comparing the number of elements (nElems), leading to incorrect ordering for arrays treated as “raw scalars.”
>
> Proposed Fix
>
> The proposed fix ensures the rawScalar flag is checked first, and only when both values have the same flag, the number of elements is compared.
> This guarantees correct ordering of arrays and scalar values. The details are in the attached patch.
>
per https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING
Object > Array > Boolean > Number > String > Null
JsonbValue->val.array.rawScalar is false (that is the real array)
should be larger than scalar (Boolean, Number, String, Null).
while sorting, rawScalar flag should have more priority than comparing
the number of elements in an array.
if two jsonb, JsonbValue->val.array.rawScalar values are different,
then we don't need to compare val.array.nElems.
so I think you are right.
but I am confused with your comments change.
src5=# select 'a' < 'A' collate "en_US.utf8";
?column?
----------
t
(1 row)
src5=# select 'a' < 'A' collate "C";
?column?
----------
f
(1 row)
docs says:
""Primitive JSON values are compared using the same comparison rules
as for the underlying PostgreSQL data type.
Strings are compared using the default database collation.
""
To make the regress tests stable, you may need to change the regress
test value ("a", "A")
the only corner case is the empty jsonb array [].
so sql test like:
select jsonb '[]' < jsonb 'null';
should enough?
From | Date | Subject | |
---|---|---|---|
Next Message | Robins Tharakan | 2024-11-29 08:42:29 | psql: Add tab completion for ALTER USER RESET |
Previous Message | Kirill Reshke | 2024-11-29 08:37:20 | Re: [PATCH] SVE popcount support |