[Pljava-dev] VarLenTuple example code

From: schabi at logix-tt(dot)com (Markus Schaber)
Subject: [Pljava-dev] VarLenTuple example code
Date: 2006-09-29 15:08:50
Message-ID: 451D3702.5070802@logix-tt.com
Lists: pljava-dev

Hi, Thomas,

Thomas Hallgren wrote:

> You are saying that two mechanisms are not enough. We need:
>
> The text (canonical) to disc conversion (the current input/output)
> The external binary (canonical) format (used in various protocols) to
> disc conversion (the send/receive).

Yes, correct.

> A way to create Java objects from canonical binary format.
> A way to create the canonical binary format from Java objects.

No, I think we should convert between Java objects and the disc format.
The disc format (aka C, "internal") is what we get handed when
PostgreSQL calls SQL functions, and what we have to give back to
PostgreSQL as result (and when calling via SPI). I don't see any value
in the C->B->D roundtrip.

Conversions between C and D are the most common ones, so they should be
fast.

> Then it all falls into place. This is what I think needs to be done:
>
> 1. We need two new methods. One that does the C -> B conversion and one
> that does the B -> C. I.e. verbatim Java implementations of the
> send/receive. Those methods should be static since this is a byte[] to
> byte[] conversion only.

Yes. That's what I wanted to implement with my static receive and send
methods in the VarLenTuple class.

> 2. The readSQL methods should be called with an SQLInput that wraps the
> output of the C -> B converter.
> 3. The data written by the writeSQL method should be passed to the B ->
> C converter.

No, the SQLInput/SQLOutput should wrap the on-disk format directly (as
they do now, AFAICS) for datatype mappings, without calling any converters.

> And of course, Java really comes into play when a COPY is performed
> since the B -> C and C ->B must be executed.

Yes.

[Warning: Direct brain-dump follows. Anticipate confusion.]

So my idea is:

- Use the readSQL/writeSQL format to read and write C (internal / disk
format) for the "type mapping". This means all the example code using
type mapping should continue to work; no code has to be changed w.r.t. this.

- Leave the UDT[] magic for input/output mapping as it is now, as it
works fine, and using the toString() semantics is nice.

- Drop the UDT[] magic for send/receive, and allow the users to define
them as "normal" static methods.

The only problem when implementing the send/receive functions as static
methods was that the pseudo type "internal" parameter is not mapped
usefully. In my eyes, mapping the "internal" parameter from receive
(which is a StringInfo in PG_GETARG_POINTER(0) internally) to an
SQLInput object should be the easiest way to solve that problem.

(It's a bit misleading to call that pseudo type "internal" despite the
fact that it carries the external representation in this case. One might
think the PostgreSQL core hackers celebrate some kind of cynicism. :-)

This way, it's possible to implement both send() and receive() as static
methods in plain Java, each implementing one direction of a clean conversion
between the Java object and the serialized, external B form.

Actually, these are C->D->B and B->D->C conversions. The C->D and D->B are
handled by the existing type mapping. Just look at how my VarLenTuple is
coded, that's what I have in mind.
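
To make the D<->B half of that concrete, here is a minimal sketch in plain
Java. It is not the VarLenTuple code: IntTuple, its int[] payload and the
length-prefixed layout are assumptions invented for illustration, and a plain
byte[] stands in for whatever the "internal" parameter would end up being
mapped to.

```java
import java.nio.ByteBuffer;

// Hypothetical variable-length type; this is the Java object ("D") form.
final class IntTuple {
    final int[] values;

    IntTuple(int... values) {
        this.values = values.clone();
    }

    // D -> B: serialize to an assumed length-prefixed external form.
    // ByteBuffer defaults to big-endian, i.e. network byte order.
    static byte[] send(IntTuple t) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * t.values.length);
        buf.putInt(t.values.length);
        for (int v : t.values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // B -> D: parse the external form back into a Java object.
    static IntTuple receive(byte[] external) {
        ByteBuffer buf = ByteBuffer.wrap(external);
        int n = buf.getInt();
        // Reject input whose payload does not match the declared count.
        if (n < 0 || buf.remaining() % 4 != 0 || n != buf.remaining() / 4) {
            throw new IllegalArgumentException("malformed external form");
        }
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = buf.getInt();
        }
        return new IntTuple(values);
    }
}
```

The C<->D half does not appear here at all; in the scheme described above it
would stay with the existing readSQL/writeSQL type mapping.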

While meditating, I came to the following conclusion: AFAICS, there's
already a mapping in place for CString to Java Strings, so input/output
functions seem to be implementable with static methods instead of UDT[]
magic, if wanted. And keeping the UDT[] magic code in place for send/
receive will not hurt either, so users can use it when they think it
fits their needs (having equal B and C formats).

Advanced optimization:
For efficiency reasons, it might be useful to have send() and receive()
convert directly between C and B, without the intermediate D step. This
could be done by having both methods work on a pair of SQLInput/
SQLOutput, but we'd need some additional magic to be able to declare
such methods, I think. Maybe we could introduce some declaration that
allows a function to receive and send SQLInput/SQLOutput directly,
without calling the type mapping code, thus they could work with the
on-disk representation directly, a kinda "high speed" path. But I'm
afraid that would be highly PostgreSQL specific and not portable in any way.
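
A rough sketch of what such a direct C<->B path could look like, with no D
object in between. Everything here is assumed for illustration only:
DirectCodec is a made-up name, plain byte[] stands in for the SQLInput/
SQLOutput pair, the C form is pretended to be a run of little-endian 32-bit
integers and the B form the same integers in network byte order. The real
on-disk layout (varlena header and so on) is ignored.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical direct converter: bytes in, bytes out, no Java object built.
final class DirectCodec {

    // Re-encode a run of 32-bit integers from one byte order to another.
    private static byte[] convert(byte[] in, ByteOrder from, ByteOrder to) {
        if (in.length % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        ByteBuffer src = ByteBuffer.wrap(in).order(from);
        ByteBuffer dst = ByteBuffer.allocate(in.length).order(to);
        while (src.hasRemaining()) {
            dst.putInt(src.getInt());
        }
        return dst.array();
    }

    // C -> B: assumed little-endian internal form to network byte order.
    static byte[] send(byte[] internal) {
        return convert(internal, ByteOrder.LITTLE_ENDIAN, ByteOrder.BIG_ENDIAN);
    }

    // B -> C: network byte order back to the assumed internal form.
    static byte[] receive(byte[] external) {
        return convert(external, ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN);
    }
}
```

The point of the sketch is only that both methods can be pure byte-shuffling
code once they are allowed to bypass the type mapping.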

Phew.
I think I'll need another night or two to think about all that, but at
least at the moment, it seems to make sense to me. :-)

Thanks,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org
