From: | Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: jsonpath versus NaN |
Date: | 2020-06-15 10:50:11 |
Message-ID: | CAPpHfdtizzX++0DUVL0_CCJjp6Rd-NT9rmqe0PvL7-t62Eq0oA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jun 11, 2020 at 10:00 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> writes:
> > On Thu, Jun 11, 2020 at 3:45 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> It is entirely clear from the code, the documentation,
> >> and the relevant RFCs that JSONB does not allow NaNs as numeric
> >> values.
>
> > The JSONB itself doesn't store number NaNs. It stores the string "NaN".
>
> Yeah, but you have a numeric NaN within the JsonbValue tree between
> executeItemOptUnwrapTarget and convertJsonbScalar. Who's to say that
> that illegal-per-the-data-type structure won't escape to somewhere else?
> Or perhaps more likely, that we'll need additional warts in other random
> places in the JSON code to keep from spitting up on the transiently
> invalid structure.
I would propose to split two things: user-visible behavior and
internal implementation. Internal implementation, which allows
numeric NaN within the JsonbValue, isn't perfect and we could improve
it. But I'd like to determine desired user-visible behavior first,
then we can decide how to fix the implementation.
>
> > I found the relevant part of the standard. Unfortunately, I can't
> > post the full standard here due to its license, but I think I can cite
> > the relevant part.
>
> I don't think this is very relevant. The SQL standard has not got the
> concepts of Inf or NaN either (see 4.4.2 Characteristics of numbers),
> therefore their definition is only envisioning that a string representing
> a normal finite number should be castable to DOUBLE PRECISION. Thus,
> both of the relevant standards think that "numbers" are just finite
> numbers.
>
> So when neither JSON nor SQL consider that "NaN" is an allowed sort
> of number, why are you doing violence to the code to allow it in a
> jsonpath?
Yes, I see. No standard insists we should support NaN. However,
standard claims .double() should behave the same as CAST to double.
So, I think if CAST supports NaN, but .double() doesn't, it's still a
violation.
> And if you insist on doing such violence, why didn't you
> do some more and kluge it to the point where "Inf" would work too?
Yep, according to standard .double() should support "Inf" as soon as
CAST to double does. The reason why it wasn't implemented is that we
use numeric as the internal storage for all the numbers. And numeric
doesn't support Inf yet.
> (It would require slightly less klugery in the wake of the infinities-
> in-numeric patch that I'm going to post soon ... but that doesn't make
> it a good idea.)
If numerics would support infinites, we can follow standard and make
.double() method work the same way as CAST to double does. Now, I get
that there is no much reason to keep current behaviour, which supports
Nan, but doesn't support Inf. I think we should either support both
NaN and Inf and don't support any of them. The latter is a violation
of the standard, but provides us with a simpler and cleaner
implementation. What do you think?
BTW, we found what the standard says about serialization of SQL/JSON items.
9.37 Serializing an SQL/JSON item (page 695)
ii) Let JV be an implementation-dependent value of type TT and
encoding ENC such that these two conditions hold:
1) JV is a JSON text.
2) When applying the General Rules of Subclause 9.36, “Parsing JSON
text” with JV as JSON TEXT, FO as FORMAT OPTION, and WITHOUT UNIQUE
KEYS as UNIQUENESS CONSTRAINT, the returned STATUS is successful
completion and the returned SQL/JSON ITEM is an SQL/JSON item that is
equivalent to SJI.
If there is no such JV, then let ST be the exception condition: data
exception — invalid JSON text.
Basically it says that the resulting text should result in the same
SQL/JSON item when parsed. I think this literally means that
serialization of numeric NaN is impossible as soon as it's impossible
to get numeric NaN as the result json parsing. However, in the same
way this would mean that serialization of datetime is also impossible,
but that seems like nonsense. So, I think this paragraph of the
standard is ill-conceived.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2020-06-15 11:02:22 | Re: Transactions involving multiple postgres foreign servers, take 2 |
Previous Message | Fujii Masao | 2020-06-15 09:59:49 | Re: Review for GetWALAvailability() |