Re: Extract numeric filed in JSONB more effectively

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Chapman Flack <chap(at)anastigmatix(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Extract numeric filed in JSONB more effectively
Date: 2024-03-09 23:16:40
Message-ID: 87o7bn7z56.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


>> But I have a different question about this patch set. This has some
>> overlap with the JSON_VALUE function that is being discussed at
>> [0][1]. For example, if I apply the patch
>> v39-0001-Add-SQL-JSON-query-functions.patch from that thread, I can run
>>
>> select count(*) from tb where json_value(a, '$.a' returning numeric) = 2;
>>
>> and I get a noticeable performance boost over
>>
>> select count(*) from tb where cast (a->'a' as numeric) = 2;
>
> Here is my test and profile about the above 2 queries.
>
..
> As we can see the patch here has the best performance (this result looks
> be different from yours?).
>
> After I check the code, I am sure both patches *don't* have the problem
> in master where it get a jsonbvalue first and convert it to jsonb and
> then cast to numeric.
>
> Then I perf the result, and find the below stuff:
>
..

> JSONB_VALUE has a much longer way to get getKeyJsonValueFromContainer,
> then I think JSON_VALUE probably is designed for some more complex path
> which need to pay extra effort which bring the above performance
> difference.

Hello Peter,

Thanks for highlight the JSON_VALUE patch! Here is the sistuation in my
mind now.

My patch is desigined to *not* introducing any new user-faced functions,
but let some existing functions run faster. JSON_VALUE patch is designed
to following more on SQL standard so introuduced one new function which
has more flexibility on ERROR handling [1].

Both patches are helpful on the subject here, but my patch here has a
better performance per my testing, I don't think I did anything better
here, just because JSON_VALUE function is designed for some more generic
purpose which has to pay some extra effort, and even if we have some
chance to improve JSON_VALUE, I don't think it shoud not block the patch
here (I'd like to learn more about this, it may takes some time!)

So I think the my patch here can be go ahead again, what do you think?

[1]
https://www.postgresql.org/message-id/CACJufxGtetrn34Hwnb9D2if5D_HOPAh235MtEZ1meVYx-BiNtg%40mail.gmail.com

--
Best Regards
Andy Fan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-03-10 00:20:06 Vectored I/O in bulk_write.c
Previous Message Andy Fan 2024-03-09 23:12:27 Re: remaining sql/json patches