Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Daniel Farina <drfarina(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Daniel Farina <dfarina(at)truviso(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Date: 2009-11-24 13:20:57
Message-ID: 1259068857.30357.104.camel@hvost1700
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2009-11-24 at 03:48 -0800, Daniel Farina wrote:
> On Tue, Nov 24, 2009 at 3:25 AM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
> > On Tue, 2009-11-24 at 02:56 -0800, Daniel Farina wrote:
> >> On Tue, Nov 24, 2009 at 2:51 AM, Daniel Farina <drfarina(at)gmail(dot)com> wrote:
> >> > On Tue, Nov 24, 2009 at 2:50 AM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
> >> >> Can't you use existing aggregate function design ?
> >> >>
> >> >> CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
> >> >> SFUNC = sfunc,
> >> >> STYPE = state_data_type
> >> >> [ , FINALFUNC = ffunc ]
> >> >> [ , INITCOND = initial_condition ]
> >> >> [ , SORTOP = sort_operator ]
> >> >> )
> >> >
> >> > Actually, yes. I just thought that this was an idea so crazy that no
> >> > one would like it.
> >
> > seems kind of natural choice for me - in essence this is an aggregate
> > function, aggregating over rows/tuples supplied to it.
>
> Okay, well, maybe that wasn't such a crazy idea after all...
>
> >> Oh, and the other elephant in the room: error handling. How to handle
> >> error conditions...try/catch/finally type stuff.
> >
> > Same as current aggregates - either ignore the error, logi it and
> > continue, or bail out
> >[snip]
> > Neither do ordinary funtions, we have no "ON ERROR DO ..." clause for
> > function definitions
>
> It is assumed most functions do not have side effects outside the
> database, so this is gotten rather for free. The driving use case for
> this *is* side effects on other systems. I'm not sure if it's as easy
> to use this justification here...normally rollbacks just take care of
> all the error handling a function would want. Here I'm not so sure
> that is as common a case.

A cleaner solution for undoing external effects would be ON ROLLBACK
trigger, or maybe even extension to BEGIN

BEGIN WORK ON ROLLBACK RUN externalCleanupFunction();

ROLLBACK trigger could also be done as SET parameter inside a session,
so it wont bloat/pollute system tables if changed often;

> >
> >> I did consider using
> >> aggregates though, but somehow it felt to me like "I need at least a
> >> three-tuple, why not fish around for any random bundling of three
> >> functions..."
> >
> > Why do you need three ?
>
> I'm counting the aggregate prototype itself to refer to the bundle,
> which I suppose would be more normally considered a two-tuple of
> functions. This is a self-referential tuple, I suppose...
>
> >> After all, I would not want to actually call the nodeAgg stuff to
> >> apply the function anyway...so it'd basically be abused as a
> >> three-tuple of functions.
> >
> > Actually it would be best if it could use straight generic funtions, so
> > you could do something like
> >
> > COPY stdin TO filterfunc(int) TO avg(int);
>
> Generic functions? Do you mean just scalar functions?

Type. Actually I meant our existing aggregate functions.

> That'd be
> neat, but as I said previously, composition could just be wrapped into
> a function of the user's choice. Also, what about use of
> multi-function-apply?
>
> COPY stdin TO replicant1(datum) AND replicant2(datum);

seems like a rare case, but you could use a wrapper func

CREATE FUNCTION replicants_1_and_2(datum) AS
replicant1(datum)
replicant2(datum)

> You could imagine all sorts of new 2PC evil.

2PC is evil enyway, at least when performance is concerned ;)

> But again, one could
> just write a little function to absorb the rows and dole them out
> without bloating COPY syntax...
>
> I am in no way suggesting that syntax seriously or unseriously.
>
> > pass the file name in as an argument to SFUNC, open it on first call,
> > ignore later (if it stays the same ;)
>
> So either you are going to pass it with every row and ignore it,

That would be my preferred way, yes

> or create a new initial aggregate state for each COPY TO FUNCTION

third, more hackish way would to set it as INITCOND = '/file/name' :)

> ...how are you going to get it passed to SFUNC?

keep the file handle in the aggregate node - it is for keeping state,
and file handle sure is part of state.

--
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2009-11-24 13:39:21 Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Previous Message Daniel Farina 2009-11-24 13:00:30 Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION