From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Extract numeric filed in JSONB more effectively |
Date: | 2023-08-03 10:04:18 |
Message-ID: | CAEze2WhMnkx=ASndH=UXd-W1yqBokHGMW6Tie3U3RWm69iRD1w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, 2 Aug 2023 at 03:05, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
> Hi Matthias:
>
> On Wed, Aug 2, 2023 at 7:33 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>>
>>
>>
>> On Tue, Aug 1, 2023 at 7:03 PM Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> wrote:
>>>
>>> On Tue, 1 Aug 2023 at 06:39, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>>> >
>>> > Hi:
>>> >
>>> > Currently if we want to extract a numeric field in jsonb, we need to use
>>> > the following expression: cast (a->>'a' as numeric). It will turn a numeric
>>> > to text first and then turn the text to numeric again.
>>>
>>> Why wouldn't you use cast(a->'a' as numeric), or ((a->'a')::numeric)?
>>
>>
>> Thanks for this information! I didn't realize we have this function
>> already at [1].
>>
>> https://www.postgresql.org/docs/15/functions-json.html
>
>
> Hi:
>
> I just found ((a->'a')::numeric) is not as effective as I expected.
>
> First in the above expression we used jsonb_object_field which
> returns a jsonb (see JsonbValueToJsonb), and then we convert jsonb
> to jsonbValue in jsonb_numeric (see JsonbExtractScalar). This
> looks like a wastage.
Yes, it's not great, but that's just how this works. We can't
pre-specialize all possible operations that one might want to do in
PostgreSQL - that'd be absurdly expensive for binary and initial
database sizes.
> Secondly, because of the same reason above, we use PG_GETARG_JSONB_P(0),
> which may detoast a value so we need to free it with PG_FREE_IF_COPY.
> then this looks like another potential wastage.
Is it? Detoasting only happens if the argument was toasted, and I have
serious doubts that the result of (a->'a') will be toasted in our
current system. Sure, we do need to allocate an intermediate result,
but that's in a temporary memory context that should be trivially
cheap to free.
> /*
> * v.val.numeric points into jsonb body, so we need to make a copy to
> * return
> */
> retValue = DatumGetNumericCopy(NumericGetDatum(v.val.numeric));
>
> At last this method needs 1 extra FuncExpr than my method, this would
> cost some expression execution effort. I'm not saying we need to avoid
> expression execution generally, but extracting numeric fields from jsonb
> looks a reasonable case.
But we don't have special cases for the other jsonb types - the one
that is available (text) is lossy and doesn't work reliably without
making sure the field we're accessing is actually a string, and not
any other type of value.
> As a comparison, cast to other data types like
> int2/int4 may be not needed since they are not binary compatible.
Yet there are casts from jsonb to and back from int2, int4 and int8. I
don't see a very good reason to add this, for the same reasons
mentioned by Pavel.
*If* we were to add this operator, I would want this patch to also
include a #-variant for text[]-based deep access (c.q. #> / #>>), and
equivalent operators for the json type to keep the current access
operator parity.
> Here is the performance comparison (with -O3, my previous post is -O0).
>
> select 1 from tb where (a->'a')::numeric = 2; 31ms.
> select 1 from tb where (a(at)->'a') = 2; 15ms
What's tb here?
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
From | Date | Subject | |
---|---|---|---|
Next Message | Laetitia Avrot | 2023-08-03 10:06:11 | Re: Adding a pg_servername() function |
Previous Message | Pavel Stehule | 2023-08-03 09:47:33 | Re: Extract numeric filed in JSONB more effectively |