| From: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> | 
|---|---|
| To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> | 
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Extract numeric filed in JSONB more effectively | 
| Date: | 2023-08-02 01:05:44 | 
| Message-ID: | CAKU4AWpSGSxjKVq_S=1pfN5UxbBEd_H9ZvBUQ741EXWpdNW1=Q@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi Matthias:
On Wed, Aug 2, 2023 at 7:33 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
>
> On Tue, Aug 1, 2023 at 7:03 PM Matthias van de Meent <
> boekewurm+postgres(at)gmail(dot)com> wrote:
>
>> On Tue, 1 Aug 2023 at 06:39, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>> >
>> > Hi:
>> >
>> > Currently if we want to extract a numeric field in jsonb, we need to use
>> > the following expression:  cast (a->>'a' as numeric). It will turn a
>> numeric
>> > to text first and then turn the text to numeric again.
>>
>> Why wouldn't you use cast(a->'a' as numeric), or ((a->'a')::numeric)?
>>
>
> Thanks for this information! I didn't realize we have this function
> already at [1].
>
> https://www.postgresql.org/docs/15/functions-json.html
>
Hi:
I just found ((a->'a')::numeric) is not as effective as I expected.
First in the above expression we used jsonb_object_field which
returns a jsonb (see JsonbValueToJsonb), and then we convert jsonb
to jsonbValue in jsonb_numeric (see JsonbExtractScalar). This
looks like a wastage.
Secondly, because of the same reason above, we use PG_GETARG_JSONB_P(0),
which may detoast a value so we need to free it with PG_FREE_IF_COPY.
then this looks like another potential wastage.
Thirdly, I am not sure we need to do the NumericCopy automatically
in jsonb_numeric. an option in my mind is maybe we can leave this
to the caller?  At least in the normal case (a->'a')::numeric, we don't
need this copy IIUC.
/*
 * v.val.numeric points into jsonb body, so we need to make a copy to
 * return
 */
retValue = DatumGetNumericCopy(NumericGetDatum(v.val.numeric));
At last this method needs 1 extra FuncExpr than my method, this would
cost some expression execution effort. I'm not saying we need to avoid
expression execution generally, but extracting numeric fields from jsonb
looks a reasonable case. As a comparison, cast to other data types like
int2/int4 may be not needed since they are not binary compatible.
Here is the performance comparison (with -O3, my previous post is -O0).
select 1 from tb where (a->'a')::numeric = 2;  31ms.
select 1 from tb where (a(at)->'a') = 2;  15ms
-- 
Best Regards
Andy Fan
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David Rowley | 2023-08-02 01:35:49 | Re: Performance degradation on concurrent COPY into a single relation in PG16. | 
| Previous Message | Peter Geoghegan | 2023-08-02 00:50:14 | Re: Use of additional index columns in rows filtering |