From: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
---|---|
To: | Chapman Flack <chap(at)anastigmatix(dot)net> |
Cc: | jian he <jian(dot)universality(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Extract numeric filed in JSONB more effectively |
Date: | 2023-08-31 09:10:39 |
Message-ID: | CAKU4AWpM=C5NxumoHQB1=id0OBc63kUgMUWyRNO2=OrHj6ROyQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Chap,
The v11 attached, mainly changes are:
1. use the jsonb_xx_start and jsonb_finish_numeric style.
2. improve the test case a bit.
It doesn't include:
1. the jsonb_finish_text function, since we have a operator ->> for text
already and the performance for it is OK and there is no cast entry for
jsonb to text.
2. the jsonb_finish_jsonb since I can't see a clear user case for now.
Rewriting jsonb_object_field with 2 DirectFunctionCall looks not pretty
reasonable as we paid 2 DirectFunctionCall overhead to reduce ~10 lines
code duplication.
An incompatible issue at error message level is found during test:
create table jb(a jsonb);
insert into jb select '{"a": "a"}'::jsonb;
select (a->'a')::int4 from jb;
master: ERROR: cannot cast jsonb string to type *integer*
patch: ERROR: cannot cast jsonb string to type *numeric*
That's mainly because we first extract the field to numeric and
then cast it to int4 and the error raised at the first step and it
doesn't know the final type. One way to fix it is adding a 2nd
argument for jsonb_finish_numeric for the real type, but
it looks weird and more suggestions on this would be good.
Performance comparison between v10 and v11.
create table tb (a jsonb);
insert into tb select '{"a": 1}'::jsonb from generate_series(1, 100000)i;
select 1 from tb where (a->'a')::int2 = 2; (pgbench 5 times)
v11: 16.273 ms
v10: 15.986 ms
master: 32.530ms
So I think the performance would not be an issue.
> I noticed there is another patch registered in this CF: [1]
> It adds new operations within jsonpath like .bigint .time
> and so on.
>
> I was wondering whether that work would be conflicting or
> complementary with this. It looks to be complementary. The
> operations being added there are within jsonpath evaluation.
> Here we are working on faster ways to get those results out.
>
> It does not seem that [1] will add any new choices in
> JsonbValue. All of its (.bigint .integer .number) seem to
> verify the requested form and then put the result as a
> numeric in ->val.numeric. So that doesn't add any new
> cases for this patch to handle. (Too bad, in a way: if that
> other patch added ->val.bigint, this patch could add a case
> to retrieve that value without going through the work of
> making a numeric. But that would complicate other things
> touching JsonbValue, and be a matter for that other patch.)
>
> It may be expanding the choices for what we might one day
> find in ->val.datetime though.
>
> Thanks for this information. I tried the jsonb_xx_start and
jsonb_finish_numeric style, and it looks like a good experience
and it may not make things too complicated even if the above
things happen IMO.
Any feedback is welcome.
--
Best Regards
Andy Fan
Attachment | Content-Type | Size |
---|---|---|
v11-0001-optimize-casting-jsonb-to-a-given-type.patch | application/octet-stream | 23.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jelte Fennema | 2023-08-31 09:13:00 | pg_basebackup: Always return valid temporary slot names |
Previous Message | Jelte Fennema | 2023-08-31 09:01:07 | Re: Allow specifying a dbname in pg_basebackup connection string |