Re: [PATCH] plpythonu datatype conversion improvements

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Caleb Welton <cwelton(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] plpythonu datatype conversion improvements
Date: 2009-08-22 11:35:48
Message-ID: 407d949e0908220435v5ba7c72erb4290d7cc0f4cdab@mail.gmail.com
Lists: pgsql-hackers

On Sat, Aug 22, 2009 at 11:45 AM, Caleb Welton<cwelton(at)greenplum(dot)com> wrote:
> As documented in the patch, the primary motivation was support of BYTEA
> datatype, which when cast through cstring was truncating python strings with
> embedded nulls,
> performance was only a secondary consideration.

The alternative to attaching to the internal representation would be
to marshal and unmarshal the text representation where nuls are
escaped as \000.
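
To make the two options concrete, here is a minimal Python sketch. It is
an illustration only, not code from the patch: the function names are
made up, ctypes merely stands in for a C-string conversion, and the
escaping rules (printable ASCII kept as-is, backslash doubled, every
other byte written as a three-digit octal escape) are my reading of the
bytea escape format.

```python
import ctypes


def via_cstring(data: bytes) -> bytes:
    """Round-trip through a NUL-terminated C string (hypothetical helper)."""
    buf = ctypes.create_string_buffer(data)  # copies data, appends a NUL
    return buf.value  # reads only up to the first NUL byte


def bytea_escape(data: bytes) -> str:
    """Marshal bytes to an escaped text form (assumed escape rules)."""
    out = []
    for b in data:
        if b == 0x5C:                 # backslash is doubled
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:       # printable ASCII passes through
            out.append(chr(b))
        else:                         # everything else becomes \ooo
            out.append("\\%03o" % b)
    return "".join(out)


def bytea_unescape(text: str) -> bytes:
    """Unmarshal the escaped text form; assumes well-formed input."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
        elif text[i + 1] == "\\":
            out.append(0x5C)
            i += 2
        else:
            out.append(int(text[i + 1:i + 4], 8))  # three octal digits
            i += 4
    return bytes(out)


if __name__ == "__main__":
    payload = b"abc\x00def"
    print(via_cstring(payload))                            # truncated copy
    print(bytea_escape(payload))                           # escaped text
    print(bytea_unescape(bytea_escape(payload)) == payload)
```

The escaped form survives the embedded NUL, but every datum has to be
walked byte by byte in both directions.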

However I dispute that this is "micro-performance" that we're talking
about. On any given small datum it may be a small incremental amount
of time, but it's not the incremental time that matters, it's the aggregate. If
you're processing 1TB of data and you have to marshal and unmarshal
all 1TB it doesn't matter that you're doing it in 100 byte chunks. And
in any case there are plenty of people throwing around multi-megabyte
bytea blobs, and having to marshal and unmarshal them every time they
go from the database into a PL or back would add a noticeable delay and
a risk of out-of-memory errors.
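
As a rough illustration of the memory side of this, the sketch below
computes how large the escaped text form of a blob would be without
building it. The function names are hypothetical, and the sizing rules
assume the same escape format as above: 1 character for printable
ASCII, 2 for a backslash, 4 for any other byte.

```python
def escaped_length(data: bytes) -> int:
    """Size in characters of the escaped text form (assumed rules)."""
    total = 0
    for b in data:
        if b == 0x5C:              # backslash -> two characters
            total += 2
        elif 0x20 <= b <= 0x7E:    # printable ASCII -> one character
            total += 1
        else:                      # \ooo -> four characters
            total += 4
    return total


def expansion_factor(data: bytes) -> float:
    """Ratio of escaped size to raw size; 1.0 for empty input."""
    if not data:
        return 1.0
    return escaped_length(data) / len(data)


if __name__ == "__main__":
    blob = b"\x00" * (1024 * 1024)   # 1 MiB of zero bytes, worst case
    print(escaped_length(blob))      # escaped size in characters
    print(expansion_factor(blob))    # growth relative to the raw blob
```

Under these assumptions a blob made up of non-printable bytes needs
four times its own size for the text copy, on top of the original.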

If we want PLs to not be overly in bed with Postgres data types then
the way to do it is to have data types provide abstract methods for
accessing their internals. At least for bytea and text that would be
fairly straightforward. For numeric I don't see that it would really
buy much since it wouldn't really let us completely change
representations.
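
A very loose sketch of what such abstract methods might look like,
written in Python only for brevity. None of these names exist in
Postgres or PL/Python; VarlenaAccessor, ByteaDatum and pl_receive are
all invented for illustration. The point is that the PL sees only a
length and a read-only view of the payload, never the storage layout
or a text form.

```python
from abc import ABC, abstractmethod


class VarlenaAccessor(ABC):
    """Hypothetical interface a data type could expose to PLs."""

    @abstractmethod
    def length(self) -> int:
        """Number of payload bytes."""

    @abstractmethod
    def view(self) -> memoryview:
        """Read-only view of the payload, without copying it."""


class ByteaDatum(VarlenaAccessor):
    """Toy stand-in for a bytea value; real storage would differ."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def length(self) -> int:
        return len(self._payload)

    def view(self) -> memoryview:
        return memoryview(self._payload)


def pl_receive(datum: VarlenaAccessor) -> bytes:
    """What a PL might do: one copy of the raw bytes, no escaping."""
    return bytes(datum.view())


if __name__ == "__main__":
    datum = ByteaDatum(b"abc\x00def")
    print(datum.length())
    print(pl_receive(datum) == b"abc\x00def")
```

With an interface along these lines the data type could change its
internal layout without the PL noticing, as long as the accessors keep
returning the same bytes.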

--
greg
http://mit.edu/~gsstark/resume.pdf
