Re: SQLJSON

From: Christopher BROWN <brown(at)reflexe(dot)fr>
To: Dave Cramer <pg(at)fastcrypt(dot)com>
Cc: Markus KARG <markus(at)headcrashing(dot)eu>, List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: SQLJSON
Date: 2015-06-29 21:19:17
Message-ID: CAHL_zcM+WHVSFoYFHq8GphmaLAbs0dsBvsM-iUSeUFZ38SmkZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Hello,

In summary, is this what is planned ?

- Bundle the JSONP API with the driver (specific version)
- Bundle the JSONP RI with the driver (specific version, with modified
package names)
- Override getObject() to return a "JSON DOM" (buffering data once, then
reconstructing it again as an object model)

Is that correct? Could you remind me what does getObject() currently
return for JSONB (I don't have a test case to hand)?

Thanks,
Christopher

On 29 June 2015 at 22:58, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:

> On 29 June 2015 at 16:22, Markus KARG <markus(at)headcrashing(dot)eu> wrote:
>
>> Dave,
>>
>>
>>
>> I understand that people expect that getObject does something reasonable
>> - but the case why this does not happen is because PostgreSQL does not
>> provide something reasonable AFAIK, or is there JSON support in the
>> protocol (I might have missed that, actually, then please just tell me)?
>>
>>
> Well things in PostgreSQL move at rather a glacial pace as far as
> features go. The emphasis is on reliability. If you think this mailing list
> is contentious you might want to have a look at hackers. To answer your
> question though there is no special backend support for JSON, or any other
> streaming. Yes this is a performance drawback, but it is what it is.
>
>>
>>
>> But I need to tell you that you have to deal with the risk that if JDBC 5
>> comes up with the definition "getObject has to return FancyJsonType" you'll
>> face a backwards compatibility problem when you decided for pretty anything
>> to return but not "FancyJsonType", whether the type you've choosen looked
>> obvious to the common user, to almost any user or to nobody at all.
>> Anyways, you're the boss, and I trust in you to do the right choice. :-)
>>
>
> Doing nothing is surely the wrong choice. The question is what to do. I
> don't pretend to know the right answer here.However in the absence of the
> spec, I'm sure whatever we do will be a compromise.
>
>>
>>
>> Regarding your metaphor, please in turn understand thar some guys in this
>> form, like Vladimir and me, are professional combustion engine engineers in
>> our main job. Our whole life is dedicated to the difference between a
>> vanilla BMW 318 into an M3. I hope there is place for such engineers in
>> your team, too, and it is _not_really_ your attitude that you are happy
>> once the weels don't fall away once you start the motor. We do not do it
>> for thank or cheer, we simply do it because it is the attitude of our main
>> jobs. We cannot forget our knowledge about compression and consumption just
>> because 99% of the users don't care for. And part of this knowledge is
>> thinking of efficiency and safety and cost of production and unassembly for
>> recycling, and usefulness and maintainability and testability of the car,
>> too, even if no user _ever_ would think about when sitting is his ride. If
>> our demands on the professionality and performance of the end product is
>> not what you like to achieve and you like to simply replace the pipe with a
>> chromed one, then please let us know. Otherwise our PRs will one day
>> produce a stack overflow on GitHub, which nobody wants. ;-)
>>
>
> As I said in my earlier email I am grateful for your contributions and I
> would love them to continue, but we need to keep the majority of the users
> in mind. We also need to make sure in our quest for performance that we
> don't break anything.
>
> Regards,
>
> Dave
>
>>
>>
>> Have fun
>>
>> -Markus
>>
>>
>>
>>
>>
>> *From:* pgsql-jdbc-owner(at)postgresql(dot)org [mailto:
>> pgsql-jdbc-owner(at)postgresql(dot)org] *On Behalf Of *Dave Cramer
>> *Sent:* Montag, 29. Juni 2015 21:23
>> *To:* Álvaro Hernández Tortosa
>> *Cc:* List
>> *Subject:* Re: [JDBC] SQLJSON
>>
>>
>>
>> Markus,
>>
>>
>>
>> I really value your recent input to the driver and would like it to
>> continue but we need to keep things in perspective. The mere fact that you
>> are engaging in discussion on this list puts you in the 99.99999th (keep
>> going with 9's if you like) percentile of users. Most and I mean the rest
>> of the people in that population just expect the driver to do something
>> reasonable. They don't provide JSON parsers or even know there is a
>> difference. When they do getObject and that object is a json value they
>> expect something magical to happen. We return an object they can use.
>>
>>
>>
>> The other less than exciting fact about the driver is that it is not the
>> central feature of PostgreSQL. Allow me a metaphor if you will. I like cars
>> so I'll chose to use a driveshaft. It connects the engine to the wheels.
>> There is nothing new about driveshafts, although recently they have been
>> made out of carbon fibre, but essentially they are tubes which connect the
>> crankshaft to the wheels. To continue the metaphor, when the driver applies
>> the gas pedal they expect the car to move. If the driveshaft breaks then
>> this doesn't happen. Our job is to make sure the driveshaft continues to
>> connect the engine to the wheels. If we can make it out of carbon fibre so
>> much the better, but it is a particularly thankless job.
>>
>>
>>
>>
>>
>> Kind Regards,
>>
>>
>> Dave Cramer
>>
>> dave.cramer(at)credativ(dot)ca
>> http://www.credativ.ca
>>
>>
>>
>> On 29 June 2015 at 15:01, Álvaro Hernández Tortosa <aht(at)8kdata(dot)com>
>> wrote:
>>
>>
>> Markus:
>>
>> On 29/06/15 18:26, Markus KARG wrote:
>>
>> Unfortunately, yes, it is weird to ask, as what you do actually is
>> working already once you accept the technical reality that an application
>> either has to use JSON SQL functions OR provide an implementation of the
>> JSONP API to be able to process JsonObject.
>>
>>
>> It was rhetorical. Precisely I'm saying let's include an
>> implementation so that you could process a JsonValue. Of course.
>>
>>
>> I actually cannot see what is so specific to PostgreSQL that pgjdbc has
>> to support a use case that no other driver supports.
>>
>>
>> Maybe that PostgreSQL has JSON, really good one, while others don't?
>> Or let's ruin all the work done into JSON just because you fear of an
>> absolutely unrealistic class path? Really? Please....
>>
>>
>>
>>
>> The argument against it is that it provides classpath clashes once pgjdbc
>> is used in a Java EE environment, and that our solution might be
>> incompatible with JDBC 5, and that our solution imposes work and complexity
>> on the pgjdbc contributors just for the sake to make you happy, without
>> providing you any measurable benefit.
>>
>>
>> Markus, when you say "to make you happy".... please take back your
>> words and stick to rational arguments. If you cannot, please abandon this
>> otherwise respectful and enriching debate.
>>
>> I'm speaking for all the users, all of them that want to use 9.4's
>> best thing (with logical decoding permission, of course).
>>
>> Yet you only want to stop everything speaking of a class path that
>> very likely no one will ever experience, or some fears about JDBC5 which
>> might be light years ahead. We will worry about JDBC5 then, but now we need
>> to help our users, not help them (with your help) to go to MongoDB.
>>
>>
>>
>>
>> Álvaro
>>
>>
>>
>> --
>>
>> Álvaro Hernández Tortosa
>>
>>
>>
>>
>>
>> -----------
>>
>> 8Kdata
>>
>>
>>
>>
>>
>>
>>
>>
>> -Markus
>>
>>
>>
>> *From:* pgsql-jdbc-owner(at)postgresql(dot)org [
>> mailto:pgsql-jdbc-owner(at)postgresql(dot)org <pgsql-jdbc-owner(at)postgresql(dot)org>]
>> *On Behalf Of *Álvaro Hernández Tortosa
>> *Sent:* Sonntag, 28. Juni 2015 21:06
>> *To:* pgsql-jdbc(at)postgresql(dot)org
>> *Subject:* Re: [JDBC] SQLJSON
>>
>>
>>
>>
>>
>> On 28/06/15 17:09, Markus KARG wrote:
>>
>> If your application does not want to deal with JSON data then don't
>> request it from the database, but use
>> http://www.postgresql.org/docs/9.4/static/functions-json.html in your
>> SELECT statement. Performing a JSON extraction function natively on the
>> server and simply use the existing data types in your application not only
>> reliefs us from dealing with JSON in the driver, but also will work
>> magnitudes faster particularly on congested networks. That does work by
>> default already in the existing driver. What you ask for does not provide
>> any benefit to the user from my view, or I still do not understand you
>> exact scenario.
>>
>>
>> Markus, here you are trying to tell users what to do. I prefer to
>> give them freedom, works best.
>>
>> All I'm saying is:
>>
>> - Give users an API to get a javax.json.JsonValue out of a PostgreSQL
>> column which is of type json, jsonb or the result of an expression which
>> evaluates to any of those.
>>
>> - Embed a JSON parser in the driver (like the RI implementation, which
>> adds only 64Kb) so that users don't need to load any other code, unless
>> they want to override the default JSON parser.
>>
>> From there, I don't care whether the user uses JSON functions at the
>> server or the JSON API. What I want is that it works by default and that
>> you can query JSON. Is this that weird to ask?
>>
>> I'm sorry, I fail to understand your solution to the problem. Would
>> you please mind elaborating on the drawbacks of my proposal? I stated in
>> previous emails the drawbacks I see on not doing it, but I still fail to
>> see an argument against this.
>>
>> Thanks,
>>
>> Álvaro
>>
>>
>> --
>>
>> Álvaro Hernández Tortosa
>>
>>
>>
>>
>>
>> -----------
>>
>> 8Kdata
>>
>>
>>
>>
>>
>>
>>
>> *From:* pgsql-jdbc-owner(at)postgresql(dot)org [
>> mailto:pgsql-jdbc-owner(at)postgresql(dot)org <pgsql-jdbc-owner(at)postgresql(dot)org>]
>> *On Behalf Of *Álvaro Hernández Tortosa
>> *Sent:* Sonntag, 28. Juni 2015 11:57
>> *To:* pgsql-jdbc(at)postgresql(dot)org
>> *Subject:* Re: [JDBC] SQLJSON
>>
>>
>>
>>
>>
>> On 28/06/15 11:49, Markus KARG wrote:
>>
>> You miss one essential point against bundling: An application author who
>> wants to process JsonObject must have that class on his classpath, even if
>> he does not use JDBC at all. This essential is HIS problem as HE wants to
>> process it. So where does he get that class from other than putting any JSR
>> 253 implementation in his classpath? Again, it is simply not pgjdbc's
>> problem as in that scenario no JDBC is used at all.
>>
>> I don't agree, it's not a must. I may want to extract JSON data from
>> the database and then manipulate it as non-JSON data. It all depends on
>> your domain Objects.
>>
>> I already stated how bad for the user is not to have a driver that
>> works by default. It may be as hard as adding a 64Kb to the driver. I don't
>> understand how is this a problem, vs. the problem it creates for the user.
>>
>> Regards,
>>
>> Álvaro
>>
>>
>>
>> --
>>
>> Álvaro Hernández Tortosa
>>
>>
>>
>>
>>
>> -----------
>>
>> 8Kdata
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Álvaro Hernández Tortosa [mailto:aht(at)8Kdata(dot)com <aht(at)8Kdata(dot)com>]
>>
>> *Sent:* Sonntag, 28. Juni 2015 11:41
>> *To:* Christopher BROWN
>> *Cc:* Sehrope Sarkuni; Markus KARG; Dave Cramer; List
>> *Subject:* Re: [JDBC] SQLJSON
>>
>>
>>
>>
>>
>> On 28/06/15 09:34, Christopher BROWN wrote:
>>
>> Hello,
>>
>>
>>
>> Quick contribution, I'm not answering in-line because there are already
>> too many in-line answers and it's getting unreadable.
>>
>> - In my own applications, I use Jackson, but it's many up of
>> different ".jar" files and has therefore no definitive form (you can
>> concoct lots of combinations). It's also quite heavy in terms of
>> footprint, and embedding it makes no sense, because you'd have to keep
>> updating the driver to keep up to date with Jackson. Finally, it doesn't
>> actually implement JSR353 (although it would be possible to create a thin
>> wrapper), out-of-the-box (via a compatibility API) it can read JSR-353 but
>> it basically rebuilds a Jackson representation out of a "standard"
>> representation. I might choose Jackson, but I wouldn't want to impose it
>> or require that it be bundled with the driver (indeed, that would cause me
>> classloader issues as I often update to the latest version of Jackson).
>>
>> Although I mentioned Jackson as a good candidate for a default
>> implementation, you are right it does not implement JSR353 directly
>> (although wrappers already exist, like
>> https://github.com/pgelinas/jackson-javax-json) But it does not need to
>> be the default choice. I did a quick test and wrapping Jackson with
>> jaackson-javax-json and the set of dependencies to make it work would add
>> 1102Kb to the Jar file. Not much IMHO, although bigger than current driver
>> size. I would not be scared however to see a 2Mb jdbc jar file.
>>
>> However, the Reference Implementation (https://jsonp.java.net/) is
>> probably good enough and only adds 64Kb to the Jar file. The JSR353 is just
>> another 32Kb, so in total 96Kb would be added if using the RI rather than
>> Jackson. I don't consider this offensive.
>>
>> In summary: why not bundle then the RI? Works out of the box and does
>> not conflict with Jackson. Want to use Jackson? Go for it. The rest of the
>> world would have something running out-of-the-box.
>>
>>
>>
>> - You can compile the driver against the JSONP API without embedding
>> either the interfaces or an implementation. It's therefore an optional
>> feature for those that require it, and it's not rocket science to add the
>> necessary APIs to the classpath.
>> - I disagree that bundling interfaces + implementation is "making it
>> easy". For some users, perhaps, but for others, you're going to cause
>> headaches due to creating classloader conflicts (when it's already bundled
>> in their application).
>>
>> Technically, it's possible. But from a user perspective, not bundling
>> an implementation means:
>>
>> - Understanding why the code I wrote fails with a "Provider
>> org.glassfish.json.JsonProviderImpl not found".
>> - Google that. Find that you need to add the dependency and very likely
>> create a SPI file (META-INF/services/javax.json.spi.JsonProvider).
>> - Blame PostgreSQL for not doing that by default and shipping a
>> half-baked driver that is not finished (won't be my opinion, but might be
>> user's opinion).
>> - Google again to see what JSR353-compliant implementations are out there.
>> - Blame PostgreSQL again for not making this choice for you.
>> - Struggling to find a compatible implementation. Find Jackson but
>> realize requires third-party driver. Question the "quality" of that wrapper
>> and consider whether that would be "supported" with PostgreSQL driver.
>> - Luckily enough a search points you to a StackOverflow link that
>> suggests to use either this Jackson wrapper or the Reference Implementation
>> (there are not many JSR353 implementations, after all).
>> - Choose without knowing which JSON parser is best.
>> - Bundle the dependency, check now it works. Clean your sweat.
>> - Wonder why not to choose MongoDB next time, it works out of the box.
>>
>> Not funny.
>>
>> The alternative is: bundle a default (the RI) and let advanced users
>> not happy with the default implementation to make another choice (basically
>> create the META-INF/services/javax.json.spi.JsonProvider and that will
>> override the bundled RI). You have the best of both worlds.
>>
>> Pareto (https://en.wikipedia.org/wiki/Pareto_principle) let's make
>> it work for 80% and let that 20% to edit a file and "suffer" a 64Kb size
>> increase in the driver, rather than ask everybody to go through the above
>> process.
>>
>>
>>
>>
>> - If as Dave Cramer says, the PG protocol doesn't currently support
>> streaming, it still makes sense to add streaming support that reads from
>> already fully-read resources... because in that way, if the protocol
>> improves in the future, client code using the streaming API will benefit
>> (with no changes to client code) in the future.
>>
>> JSR353 already has a Streaming API. I don't understand what do we
>> need to do here, it's already done.
>>
>> Best regards,
>>
>> Álvaro
>>
>>
>>
>>
>> --
>>
>> Álvaro Hernández Tortosa
>>
>>
>>
>>
>>
>> -----------
>>
>> 8Kdata
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> Christopher
>>
>>
>>
>>
>>
>> On 28 June 2015 at 01:53, Álvaro Hernández Tortosa <aht(at)8kdata(dot)com>
>> wrote:
>>
>>
>>
>> On 28/06/15 00:55, Sehrope Sarkuni wrote:
>>
>> On Sat, Jun 27, 2015 at 6:25 PM, Álvaro Hernández Tortosa <aht(at)8kdata(dot)com>
>> wrote:
>>
>> Hi Sehrope!
>>
>>
>>
>> Hi Álvaro! :D
>>
>>
>>
>> To me, this is the least important question. If based on JSR353's
>> SPI, it's trivial to swap the default, included one, for another one. Just
>> picking a sensible default (Jackson, for instance) is probably good enough.
>>
>>
>>
>> I think I've used Jackson almost every time I've had to deal with JSON in
>> Java. The mapping API is pretty cool in that it lets you directly create an
>> target object type. If we got the route of adding methods to PGResultSet
>> then we could have something like: <T> T getJsonAsType(String, Class<T>
>> clazz)
>>
>>
>> That might be a nice addition. But I believe that goes beyond
>> driver's responsibility: I think it ends when it returns you the JSON type
>> you queried (JsonObject in my previous email, but I'm correcting now
>> myself: JsonValue)
>>
>>
>>
>>
>>
>> I'm not wedded to Jackson though. Honestly if JS353 is the standard then
>> that's what we should be using. We'd still need to figure out how to handle
>> older JVMs or maybe just selectively disable the feature (JDK8+?).
>>
>>
>> JSR353 is targeted for JavaSE 6 :)
>>
>>
>>
>>
>>
>>
>> #2 is driven a lot by #1 as depending on the parser implementation there
>> may be different object types returned. JSON is a bit tricky as "valid
>> JSON" can mean null, a scalar, an object, or an array. Most people thing of
>> it as just an object but "foobar" is valid JSON as well. This leads us to
>> #3...
>>
>>
>> The object type to return has to be IMHO JsonObject:
>> http://docs.oracle.com/javaee/7/api/javax/json/JsonObject.html
>>
>>
>>
>> Not always though. All these are valid JSON too:
>>
>>
>>
>> => SELECT '1'::json AS num, '"test"'::json AS string, '[1,2,3]'::json AS
>> arr, '{"foo":"bar"}'::json AS obj;
>>
>> num | string | arr | obj
>>
>> -----+--------+---------+---------------
>>
>> 1 | "test" | [1,2,3] | {"foo":"bar"}
>>
>> (1 row)
>>
>>
>>
>> We'll need separate getters/setters for the scalar and array types as
>> well. I agree that most people will just be using the object type though
>> (and maybe the array).
>>
>>
>> You are right here. Please s/JsonObject/JsonValue/g JsonValue is a
>> container for any of the above including objects and arrays. So it would be
>> enough just with JsonValue getJsonValue(....)
>>
>>
>>
>>
>>
>> #3 doesn't have a straight answer as there is no getJSON(...) methods in
>> the JDBC spec. It'd probably have to be returned via getObject(...).
>>
>> An alternative is to provide PGResultSet and PGPreparedStatement classes
>> similar to PGConnection that provides PG extensions. They could have the
>> get/set methods (ex: getJsonScalar(...) or setJsonObject(Map<String,Object>
>> ...)) to retrieve JSON values as specific object types (i.e. scalar,
>> object, array). It'd be a bit more type safe as presumably most people
>> using json/jsonb types know the top level type of what they're storing.
>>
>>
>> Probably adding methods to PG classes would be better than getObject
>> and force explicit casts. Regarding the methods, if they simply return
>> JsonObject, you already have a full API there to parse and extract and
>> process. So anything that returns a JsonObject from a column (identifier or
>> #) would be enough for me.
>>
>>
>>
>> For most cases I think it'd be fine. I think the custom mapping I
>> mentioned above would cover the rest. Anything beyond that would be a full
>> on transformation and would be very application specific.
>>
>>
>> Yepp
>>
>>
>>
>>
>>
>> For #4 I see two possible wins. First off on the usability side, there's
>> some convenience to natively interfacing with json/jsonb types. It'll only
>> have value though if those types are the same ones that users are using in
>> the rest of their code. If they're just using them as Map<String,Object>
>> everywhere then it'd still be a pain for a user to convert to our "native"
>> PG JSON types to use via JDBC. Having a dedicated API that allows for
>> interaction using native Java types would make this more convenient.
>>
>> The other win I can see for #4 is on performance. Right now JSON is
>> converted to a String. That means everybody using it has to convert it
>> twice. First raw bytes to String, then String to object. A dedicated API
>> could cut one of those out of the way. Given how the wire protocol is
>> implemented in the driver, it wouldn't be a direct reading from the input
>> stream (it'll be buffered in a byte array), but at least it won't be copied
>> twice.
>>
>>
>> As far as I know, most users are using JsonObject, so returning that
>> is a perfect match for pgjdbc. I don't expect however big performance wins
>> as JSON is sent as a String over the wire...
>>
>>
>>
>> The performance gain isn't on the wire, it's from not having to convert
>> bytes => String => JsonObject. It'd be bytes => JsonObject or bytes =>
>> CustomObject. Less work and less GC. The bigger the JSON string, the bigger
>> the savings too.
>>
>>
>> You are right in that JSR353 allows you to create a parser directly
>> out of an InputStream, so you would avoid converting to String. That's a
>> win. The rest of the conversions are inevitable (having the latter one you
>> pointed out laying in user's realm, beyond driver's responsibility).
>>
>> Regards,
>>
>> Álvaro
>>
>>
>>
>> --
>>
>> Álvaro Hernández Tortosa
>>
>>
>>
>>
>>
>> -----------
>>
>> 8Kdata
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Dave Cramer 2015-06-29 21:21:26 Re: SQLJSON
Previous Message Dave Cramer 2015-06-29 20:58:35 Re: SQLJSON