From: | Tom Dunstan <pgsql(at)tomd(dot)cc> |
---|---|
To: | "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org> |
Subject: | Reading and writing off-heap data |
Date: | 2017-09-21 05:43:08 |
Message-ID: | CAPPfruxfBLM=gnyW-y1-ioPvZ+3i70_XARk0eX+HBdEtk7YPjw@mail.gmail.com |
Lists: | pgsql-jdbc |
Hi all
After the original discussion back in March [1], we've finally gotten around
to scheduling this work. I've submitted a pull request [2] to support writing
data from off-heap locations using the interface discussed there.
We'd like to do the reverse too, though: read incoming data into a
caller-controlled buffer in some way. The interface would look something
like this:
// provided by driver
interface ByteStreamReader<T extends Closeable> {
T readByteStream(int length, InputStream stream) throws IOException;
}
// user code
class MyCustomByteStreamReader implements ByteStreamReader<MyBufferHandle> {
...
}
preparedStatement.registerByteStreamReader(new MyCustomByteStreamReader());
...
MyBufferHandle b = (MyBufferHandle) resultSet.getObject(2);
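
To make the proposal concrete, here is a minimal sketch of what a user-side reader might look like, assuming the interface above is accepted as written. DirectBufferHandle and DirectBufferReader are hypothetical names invented for illustration and are not part of pgjdbc; the sketch copies exactly `length` bytes from the stream into a direct (off-heap) ByteBuffer.

```java
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// The interface proposed in this message (assumption: not an existing driver API).
interface ByteStreamReader<T extends Closeable> {
    T readByteStream(int length, InputStream stream) throws IOException;
}

// Hypothetical handle that owns an off-heap buffer until it is closed.
final class DirectBufferHandle implements Closeable {
    private ByteBuffer buffer;

    DirectBufferHandle(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    ByteBuffer buffer() {
        if (buffer == null) {
            throw new IllegalStateException("handle is closed");
        }
        return buffer;
    }

    @Override
    public void close() {
        // Only drops the reference; a pooled implementation would return the buffer here.
        buffer = null;
    }
}

// Hypothetical reader that copies the column value into a direct ByteBuffer.
final class DirectBufferReader implements ByteStreamReader<DirectBufferHandle> {
    private static final int CHUNK_SIZE = 8192;

    @Override
    public DirectBufferHandle readByteStream(int length, InputStream stream) throws IOException {
        ByteBuffer target = ByteBuffer.allocateDirect(length);
        // Small on-heap scratch array used to move bytes from the stream to the buffer.
        byte[] chunk = new byte[Math.max(1, Math.min(CHUNK_SIZE, length))];
        int remaining = length;
        while (remaining > 0) {
            int n = stream.read(chunk, 0, Math.min(chunk.length, remaining));
            if (n < 0) {
                throw new EOFException("stream ended with " + remaining + " bytes still expected");
            }
            target.put(chunk, 0, n);
            remaining -= n;
        }
        target.flip(); // make the written bytes readable from position 0
        return new DirectBufferHandle(target);
    }
}
```

The reader never consumes more than `length` bytes, on the assumption that the driver hands over its shared connection stream and anything past that point belongs to the next value.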
There are a couple of issues:
1. Internally, the driver passes incoming tuples around as byte[][]
instances, which doesn't leave much ability to do something else with the
incoming data. I've submitted a PR [3] that introduces a Tuple class as a
wrapper to pass around, which then allows us to do more interesting things
with the data.
2. How should we register the reader? We have to do it ahead of execution
of the query, as the driver has already read at least some data rows by the
time we return the ResultSet.
Some potential options are:
a) Register against the statement and use it for all columns of binary
type. This would look like the above.
b) Register against the statement but for individual columns:
statement.registerByteStreamReader(2, new MyCustomByteStreamReader());
statement.registerByteStreamReader("foo", new MyCustomByteStreamReader());
c) Mark incoming columns in some other way that the driver can recognise.
This requires getting creative. An example would be to create a domain over
the bytea type and then register the reader for that type. Then queries
would have to have results cast to that type.
d) Register against a higher-level object like the connection or driver and
use it for all columns of binary type.
Option a) is the simplest in that it requires neither the driver to keep
track of readers for individual columns nor users to mess with their
database schema. It's definitely enough for my use-case, but I'm interested
in hearing other opinions on whether that's flexible enough.
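
For comparison, the extra bookkeeping that option b) adds over option a) could be sketched roughly as below. ByteStreamReaderRegistry is a hypothetical class made up for this example, not something in the driver or in either PR; the lookup order (column index, then column name, then the statement-wide reader) is also just an assumption about how the two options might be combined.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

// The interface proposed in this message (assumption: not an existing driver API).
interface ByteStreamReader<T extends Closeable> {
    T readByteStream(int length, InputStream stream) throws IOException;
}

// Hypothetical per-statement registry covering options a) and b).
final class ByteStreamReaderRegistry {
    private ByteStreamReader<?> defaultReader; // option a): one reader for all binary columns
    private final Map<Integer, ByteStreamReader<?>> byIndex = new HashMap<>(); // option b)
    private final Map<String, ByteStreamReader<?>> byName = new HashMap<>();   // option b)

    void register(ByteStreamReader<?> reader) {
        defaultReader = reader;
    }

    void register(int columnIndex, ByteStreamReader<?> reader) {
        byIndex.put(columnIndex, reader);
    }

    void register(String columnName, ByteStreamReader<?> reader) {
        byName.put(columnName, reader);
    }

    // Returns the most specific reader, or null if the column should stay a plain byte[].
    ByteStreamReader<?> resolve(int columnIndex, String columnName) {
        ByteStreamReader<?> reader = byIndex.get(columnIndex);
        if (reader == null) {
            reader = byName.get(columnName);
        }
        return reader != null ? reader : defaultReader;
    }
}
```

With only option a), the registry collapses to the single defaultReader field, which is the simplification argued for above.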
Is there general support for the feature? I'm again happy to code up a PR,
and I have time allocated to do that fairly soon if there's a likelihood of
it being merged.
Thanks
Tom
[1] https://www.postgresql.org/message-id/C659D6A4-430F-4F55-BE06-BE1C960A5405%40tomd.cc
[2] https://github.com/pgjdbc/pgjdbc/pull/953
[3] https://github.com/pgjdbc/pgjdbc/pull/954