From: | Mikko Tiihonen <mikko(dot)tiihonen(at)iki(dot)fi> |
---|---|
To: | Kris Jurka <jurka(at)ejurka(dot)com> |
Cc: | pgsql-jdbc(at)postgresql(dot)org |
Subject: | Re: binary tuple receiving patch v4 |
Date: | 2006-11-26 23:19:05 |
Message-ID: | Pine.OSF.4.64.0611261325430.272415@kosh.hut.fi |
Lists: | pgsql-jdbc |
On Sun, 26 Nov 2006, Kris Jurka wrote:
> Mikko Tiihonen wrote:
> >
> > I'm mailing directly to you because lately my mails that I have sent to
> > pgsql-jdbc(at)postgresql(dot)org have just vanished. I used to get a mail
> > immediately telling that the my mail has been blocked for manual
> > approval, but for the past week that has no longer happened.
>
> The list has a maximum message size of 30k. I thought people got an error
> when they sent large messages, but apparently not. I've contacted the mail
> admins and hopefully they'll get it resolved. For the moment I've put up your
> original patch (v4) and all additional patches discussed below here:
>
> http://www.ejurka.com/pgsql/patches/binary-receive/
>
> The ones starting k4-X- are incremental diffs upon your v4 patch and I've
> rolled them all up into a complete v5 version.
Thanks. I'll catch up with that one.
> > Found some more time. Now I have been mainly trying to fix more of the
> > corner cases and done quite a lot of testing and playing around with
> > time/date formats. I hope I got them working correctly, but on the other
> > hand maybe I just wrote bad test cases.
> >
>
> Well you need to at least run the existing regression tests which revealed a
> number of problems (1-3 below):
>
> k4-1 Code was not correctly setting the wasNullFlag, so after retrieving a
> null value the call to ResultSet.wasNull() did not return true.
>
> k4-2 When a query is retrieved in chunks (Statement.setFetchSize()), it will
> keep executing until nothing more is returned. The test it uses includes a
> check to see if it got any new Fields. Here the already cached Fields are not
> new, so we don't count them.
>
> k4-3 The code to determine whether to convert the fields to binary is not
> aware of when the query is reparsed, so it can get out of sync. In general
> trying to run this in AbstractJdbc2Statement is the wrong place. Binary
> results are really limited to v3, so I tried to only put code into
> v3/QueryExecutorImpl and v3/SimpleQuery.
Good cleanup. It now starts to look almost manageable without needing
novels in the comments to explain my hacks.
> k4-4 Add support for setting binaryEncoding in the regression test suite by
> adjusting build.local.properties. I'm skeptical of how much testing this
> actually gets us because the usage pattern that results in binary sends is not
> one commonly found in the tests.
>
> k4-5 New URL parameters need to be documented in DriverPropertyInfo.
>
> Open issues/questions:
>
> 1) The 7.4 server does not report integer_datetimes with a ParameterStatus
> message. This was only added in the 8.0 release. There may be another way to
> get this info, there is the brute force, retrieve a known value and see if
> it's an int or float. Disabling datetime types for 7.4 is also an option.
I would be inclined to go with the latter. The included
pg74-disable-binaryencoding-for-times.patch does just that.
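For the record, the brute-force probe Kris mentioned could look roughly like this: fetch a known value such as TIMESTAMP '2000-01-01 00:00:01' in binary format and inspect the 8 returned bytes. This is only a sketch; the class and method names are mine, not from any patch.

```java
// Hypothetical sketch: one second past the PostgreSQL epoch
// (2000-01-01) is 1000000 as int64 microseconds, or 1.0 as a float8,
// so the raw bytes reveal which format the 7.4 server uses.
public class DatetimeFormatProbe {
    public static boolean isIntegerDatetimes(byte[] raw) {
        long bits = 0;
        for (int i = 0; i < 8; i++) {
            bits = (bits << 8) | (raw[i] & 0xFF); // network byte order
        }
        if (bits == 1000000L) {
            return true;  // integer_datetimes = on
        }
        if (Double.longBitsToDouble(bits) == 1.0) {
            return false; // integer_datetimes = off (float8 seconds)
        }
        throw new IllegalStateException("unrecognized timestamp encoding");
    }
}
```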
> 2) Why doesn't AbstractJdbc2ResultSet.readDoubleValue do overflow checking?
Because I tried to match the existing code, which doesn't do any overflow
checking for getFloat or getDouble either.
Anyway, I attached a test case that tests the min/max values of all numeric
fields and also tries to cover all overflow and underflow cases.
Look at the filterBugs method to see what kinds of values I had to skip when
testing with text parsing. I also added full overflow checking for
getFloat and getDouble from binary encodings.
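The added check works along these lines (a sketch only; the real patch would throw the driver's SQL exception, and an unchecked exception is used here just to keep the example self-contained):

```java
// Sketch of a binary-path overflow check for getFloat on a float8
// column.  A plain (float) cast would silently turn an out-of-range
// double into Infinity; here that case becomes an error instead.
public class OverflowCheck {
    public static float toFloatChecked(double v) {
        // NaN and true infinities pass through unchanged; a finite
        // double outside the float range is an overflow
        if (!Double.isNaN(v) && !Double.isInfinite(v)
                && (v > Float.MAX_VALUE || v < -Float.MAX_VALUE)) {
            throw new ArithmeticException("value out of float range: " + v);
        }
        return (float) v;
    }
}
```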
> 3) Can binary timezones support infinity?
That very much depends on the backend. I tried to ask in my previous mails
whether the backend even has such a notion and whether the JDBC driver's
current infinity value matches the backend's or not.
OK, I just tested, and the backend does seem to accept an infinite value
for a timestamp. I'll create a patch to convert it to the same value that
the current API uses to represent it.
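Such a conversion could be sketched as follows, under two assumptions I should state plainly: that with integer datetimes the backend encodes timestamp infinity as Long.MAX_VALUE / Long.MIN_VALUE microseconds, and that the driver's text path represents infinity with the org.postgresql.PGStatement millisecond sentinels. The method name is illustrative.

```java
// Map binary timestamp microseconds (PostgreSQL epoch 2000-01-01) to
// the millisecond values the driver already uses, passing the assumed
// infinity sentinels through to their PGStatement equivalents.
public class EpochConversion {
    // assumed PGStatement.DATE_POSITIVE_INFINITY / DATE_NEGATIVE_INFINITY
    public static final long POS_INF_MILLIS = 9223372036854775807L;
    public static final long NEG_INF_MILLIS = -9223372036854775808L;
    // milliseconds from 1970-01-01 to 2000-01-01 (10957 days)
    private static final long PG_EPOCH_DIFF_MILLIS = 946684800000L;

    public static long toMillis(long pgMicros) {
        if (pgMicros == Long.MAX_VALUE) return POS_INF_MILLIS;
        if (pgMicros == Long.MIN_VALUE) return NEG_INF_MILLIS;
        // finite value: scale to millis and shift epochs
        return pgMicros / 1000 + PG_EPOCH_DIFF_MILLIS;
    }
}
```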
> 4) Why did you decide to use a BitSet instead of a HashSet to represent
> whether an oid my be transferred as binary? It seems like a waste of space
> unless they support a sparse representation.
Purely for performance reasons. The BitSet uses less than 200 bytes per
connection but has a very fast get operation (two shifts and an 'and'),
whereas using a HashSet<Integer> would mean creating a new Integer for every
lookup, calculating hashCodes, and doing the other work a HashMap has to do.
I doubt one can really see the difference in real life, but I'm trying
to create other cleanup patches that remove as many unnecessary object
creations from the receive paths as possible.
The BitSet does not have a sparse representation, but on the other hand
one cannot store many Integer objects in 200 bytes either. So I think
that, at least on 64-bit machines, the BitSet may already consume less
memory.
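To illustrate the lookup in question (class and method names are mine, not the patch's; 23 is int4's OID in the PostgreSQL catalog):

```java
import java.util.BitSet;

// One bit per type OID: set(oid) marks a type as binary-capable, and
// get(oid) answers the per-field question with no Integer boxing.
public class BinaryOids {
    private final BitSet useBinary = new BitSet();

    public void enable(int oid) {
        useBinary.set(oid);
    }

    public boolean isBinary(int oid) {
        return useBinary.get(oid);
    }
}
```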
> 5) I haven't looked at a lot of the actual conversion code, but
> TimestampUtils.toJavaSecs does not make me feel warm and fuzzy. I'll spend
> some more time on this later.
I admit it does not look nice. And I also admit that I have not tested
that it works in all the different locales (or whether the locale can alter
what PostgreSQL or the Java Calendar do with the conversion).
The problem is that while both the JVM and PostgreSQL keep time as
fractional seconds since (their own chosen) epoch, they differ in how
the Gregorian/Julian calendar transition is taken into account. They also
differ in how leap years are calculated under the Julian calendar.
Only people working with historical dates should hit these problems, and
maybe they should not even be using timestamp/date for that, but rather
roll their own representation in a varchar. For Java a good library for
that is http://joda-time.sourceforge.net/index.html, but I hope it is not
needed by the PostgreSQL JDBC driver.
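To make the leap-year divergence concrete, here are the two rules side by side (illustrative helper names): under the Julian rule every fourth year is a leap year, while the Gregorian rule skips century years not divisible by 400, so 1900 differs between the two and 2000 agrees.

```java
// The two leap-year rules whose disagreement the conversion code has
// to bridge for dates using the Julian calendar.
public class CalendarRules {
    public static boolean julianLeap(int year) {
        return year % 4 == 0;
    }

    public static boolean gregorianLeap(int year) {
        return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
    }
}
```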
> 6) It would be good for datasources to support the binaryEncoding parameter.
> I'm currently doing some work in that area, so I'll take a look at this.
Thanks. I have never had time to study how to use the DataSource interfaces.
> 7) Is binaryEncoding a good name for the parameter? I don't want people to
> confuse this with character sets which seem to give people fits already.
I have no strong feelings about the name of the parameter. How about
changing it to binaryTransfer? Another consideration is that we might later
want to send binary data to the backend as well. Would we then want a
separate option for that, or would it use the same one?
> 8) Just thought of one more: What about updatable ResultSets? The update
> will write a text value into the backing store, but it will try to read it
> as binary.
I think we should just avoid using binary encoding for updatable result
sets for now. At least until someone has implemented sending with binary
encoding, that is.
Thanks for all the comments on the patch. It is such a large change
that we should be pretty sure it does not break anything before it is
put into use.
-Mikko
Attachment | Content-Type | Size |
---|---|---|
pg74-disable-binaryencoding-for-times.patch | text/plain | 1.3 KB |
numericOverflows.patch | text/plain | 17.9 KB |