From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Extract numeric filed in JSONB more effectively |
Date: | 2023-08-03 13:52:40 |
Message-ID: | CAFj8pRD4cdUmK0RG4oN5B2KRSeDhwfMYaL=XpfEu4iaLeZ_Kow@mail.gmail.com |
Lists: | pgsql-hackers |
Hi
On Thu, Aug 3, 2023 at 15:23, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> Hi:
>
>
>> Moreover, I believe a lot of people use the more common syntax, so that
>> syntax should have good performance. For jsonb, (val->'op')::numeric
>> works, so it should carry no performance penalty, because this
>> syntax will be used in 99% of cases.
>>
>
> This looks like a valid opinion IMO, but to rescue it, we have to do
> something like an "internal structure" and remove the existing cast.
> But even if we make that effort, it still breaks some common knowledge,
> since xx::numeric would no longer be a cast. It is an "internal structure"!
>
I haven't studied the jsonb functions, but there is an XML function that
extracts a value and then casts it to a target type. It does what is
expected: for known types it uses hard-coded casts, and for other types it
asks the system catalog for a cast function or falls back to an I/O cast.
This code is used for the XMLTABLE function. The JSON_TABLE function is not
implemented yet, but it should have similar code. If you use an explicit
cast, the code should not be hard; in the rewrite stage all the information
should be known.
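
To illustrate the three-step strategy described above (hard-coded casts for known types, then a catalog lookup, then an I/O cast), here is a minimal sketch in Python. It is not the PostgreSQL implementation; the tables `HARD_CODED`, `CAST_CATALOG` and `INPUT_FUNCTIONS` and the function `cast_extracted` are hypothetical stand-ins invented for this example.

```python
from decimal import Decimal

# Hypothetical fast paths for well-known target types.
HARD_CODED = {
    "int4": int,
    "numeric": Decimal,
    "text": str,
}

# Hypothetical stand-in for a catalog of cast functions,
# keyed by (source type, target type).
CAST_CATALOG = {
    ("text", "bool"): lambda s: s.strip().lower() in ("t", "true", "yes", "on", "1"),
}

# Hypothetical stand-in for per-type input functions, used by the I/O cast
# fallback (re-parse the text form with the target type's input function).
INPUT_FUNCTIONS = {
    "float8": float,
}


def cast_extracted(value: str, target: str):
    """Cast an extracted text value to `target`; return (result, path taken)."""
    # 1. Known type: use the hard-coded cast.
    if target in HARD_CODED:
        return HARD_CODED[target](value), "hard-coded"
    # 2. Otherwise ask the catalog for a cast function.
    cast_fn = CAST_CATALOG.get(("text", target))
    if cast_fn is not None:
        return cast_fn(value), "catalog"
    # 3. Otherwise fall back to an I/O cast.
    if target in INPUT_FUNCTIONS:
        return INPUT_FUNCTIONS[target](value), "io"
    raise TypeError(f"no cast from text to {target}")
```

The point of the ordering is that the common target types never pay for a lookup, while less common types still work through the generic paths.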
>
> I don't think "Black magic" is a proper word here, since it is not much
>>> different from ->> return a text. If you argue text can be cast to
>>> most-of-types, that would be a reason, but I doubt this difference
>>> should generate a "black magic".
>>>
>>
>> I used the term black magic, because nobody without reading documentation
>> can find this operator.
>>
>
> I think this is what document is used for..
>
>
>> It is used just for this special case, and the functionality is the same
>> as using cast (only with different performance).
>>
>
> This is not good, but I didn't see a better choice so far, see my first
> graph.
>
>
>>
>> The operator ->> is more widely used. But if there is a way to
>> work without it, usage will be simpler for a lot of users,
>> even more so if the target type can be derived from context.
>>
>
> It would be cool but still I didn't see a way to do that without making
> something else complex.
>
Sure, it is significantly more work, but it would be usable for all types
and would just use the common syntax. You can implement the custom @->
operator in your own extension. Built-in solutions should be as generic as
possible.
Things should be as simple as possible, mainly for users who lack internal
knowledge; every additional way of doing a task just increases their
confusion. It is nice if users find one solution on Stack Overflow and that
solution also performs well. It is worse if users find several solutions,
but that is not too bad if those solutions have similar performance. It is
bad if one solution has great performance and the others do not. Users do
not have internal knowledge, so they don't understand why they should
sometimes use a special operator instead of the common syntax.
>
>
>>>> Maybe we can introduce some *internal operator* "extract to type", and
>>>> in the rewrite stage we can transform the pattern (x->'field')::type
>>>> to OP(x, 'field', typid)
>>>>
>>>
>>> Not sure what the OP should be? If it is a function, what is the
>>> return value? It looks to me like it is hard to do in c language?
>>>
>>
>> It should be an internal structure, similar to COALESCE or the IS
>> operator
>>
>
> It may work, but see my answer in the first graph.
>
>
>>
>>
>>>
>>> After all, if we really care about the number of operators, I'm OK
>>> with just let users use the function directly, like
>>>
>>> jsonb_field_as_numeric(jsonb, 'fieldname');
>>> jsonb_field_as_timestamp(jsonb, 'fieldname');
>>> jsonb_field_as_timestamptz(jsonb, 'fieldname');
>>> jsonb_field_as_date(jsonb, 'fieldname');
>>>
>>> it can save an operator and solves the readability issue.
>>>
>>
>> I don't like it too much, but it is better than introducing a new operator
>>
>
> Good to know it. Naming operators is a complex task if we add four.
>
>
>> We already have the jsonb_extract_path and jsonb_extract_path_text
>> function.
>>
>
> I can't follow this. jsonb_extract_path returns a jsonb, which is far
> away from
> our goal: return a numeric effectively?
>
I proposed `jsonb_extract_path_type`, which uses the anyelement type.
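
As a rough illustration of the idea only: the function follows a path into the document and converts the result to the type of an extra "type carrier" argument. The sketch below models that in Python. The name `extract_path_type` and its behavior are assumptions made for this example, not an existing PostgreSQL function; in SQL the carrier would presumably be a typed NULL, in the style of `json_populate_record`.

```python
import json
from decimal import Decimal


def extract_path_type(doc, type_hint, *path):
    """Follow `path` through nested objects and convert the value found
    to the type of `type_hint` (only its type matters, not its value).

    Returns None if any path element is missing. For brevity this sketch
    walks only JSON objects (dicts), not arrays.
    """
    node = doc
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    if node is None:
        return None
    # The result type is driven by the caller-supplied carrier value.
    return type(type_hint)(node)


# Parse floats as Decimal so numeric values are not rounded on the way in.
doc = json.loads('{"a": {"b": 12.5, "c": "7"}}', parse_float=Decimal)

as_numeric = extract_path_type(doc, Decimal(0), "a", "b")  # Decimal result
as_float = extract_path_type(doc, 0.0, "a", "b")           # float result
as_int = extract_path_type(doc, 0, "a", "c")               # int parsed from "7"
```

One function then covers every target type, instead of one function or operator per type.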
Regards
Pavel
> I can imagine using the "anyelement" type too, something like
>> `jsonb_extract_path_type(jsonb, anyelement, variadic text[] )`
>>
>
> Can you elaborate this please?
>
> --
> Best Regards
> Andy Fan
>