Re: Speed dblink using alternate libpq tuple storage

From:	Marko Kreen <markokr(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject:	Re: Speed dblink using alternate libpq tuple storage
Date:	2012-04-03 18:47:40
Message-ID:	20120403184740.GA6737@gmail.com
Lists:	pgsql-hackers

On Sun, Apr 01, 2012 at 07:23:06PM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > So the proper approach would be to have new API call, designed to
> > handle it, and allow early-exit only from there.
>
> > That would also avoid any breakage of old APIs. Also it would avoid
> > any accidental data loss, if the user code does not have exactly
> > right sequence of calls.
>
> > How about PQisBusy2(), which returns '2' when early-exit is requested?
> > Please suggest something better...
>
> My proposal is way better than that. You apparently aren't absorbing my
> point, which is that making this behavior unusable with every existing
> API (whether intentionally or by oversight) isn't an improvement.
> The row processor needs to be able to do this *without* assuming a
> particular usage style,

I don't see what kind of usage scenario you have in mind for
"early-exit without assuming anything about upper-level usage."

Could you show example code where it is useful?

The fact remains that upper-level code must cooperate with the callback.
Why is it useful to hijack PQgetResult() to do so? Especially as the
PGresult it returns is not useful in any way and the callback still needs
to use a side channel to pass actual values to the upper level.

IMHO it's much better to remove the concept of early-exit from the public
API completely and instead provide a "get"-style API that does the
early-exit internally. See below for an example.

> and most particularly it should not force people
> to use async mode.

It seems our concepts of "async mode" differ. I associate it with
PQisnonblocking(). And the old code, e.g. PQrecvRow(),
works fine in both modes.

> An alternative that I'd prefer to that one is to get rid of the
> suspension return mode altogether.

That might be a good idea. I would prefer to postpone controversial
features instead of having a hurried design for them.

Also - is there really a need to make the callback API ready for *all*
potential usage scenarios? IMHO it would be better to make sure
it's good enough that higher-level and easier-to-use APIs can
be built on top of it. And that's it.

> However, that leaves us needing
> to document what it means to longjmp out of a row processor without
> having any comparable API concept, so I don't really find it better.

Why do you think any new API concept is needed? Why is the following rule
not enough (for a sync connection):

"After exception you need to call PQgetResult() / PQfinish()".

The only thing that needs fixing is storing lastResult under PGconn
to support exceptions for PQexec(). (If we need to support
exceptions everywhere.)
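
As a rough illustration of that rule, here is a self-contained sketch that does not touch libpq at all. MockConn, mock_row_processor(), mock_get_result() and run_query() are hypothetical stand-ins invented for this example, and longjmp() plays the role of the exception. It only shows the caller draining the connection after the callback has bailed out, which is the part the rule is about.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for PGconn: tracks what is still unread. */
typedef struct
{
    int pending_rows;    /* rows of the current result not yet parsed */
    int result_pending;  /* 1 while a result is still owed to the caller */
} MockConn;

static jmp_buf row_error;

/* Stand-in row processor: "throws" when it sees the third row. */
static void mock_row_processor(int rowno)
{
    if (rowno == 3)
        longjmp(row_error, 1);
}

/*
 * Stand-in for PQgetResult(): discards whatever is left of the current
 * result.  Returns 1 if there was a result, 0 once the connection is idle.
 */
static int mock_get_result(MockConn *conn)
{
    if (!conn->result_pending)
        return 0;
    conn->pending_rows = 0;
    conn->result_pending = 0;
    return 1;
}

/* Returns the number of rows fully processed; leaves the connection idle. */
static int run_query(MockConn *conn)
{
    /* volatile: modified between setjmp() and longjmp() */
    volatile int processed = 0;

    assert(conn != NULL);

    if (setjmp(row_error) == 0)
    {
        while (conn->pending_rows > 0)
        {
            conn->pending_rows--;
            mock_row_processor(processed + 1);
            processed++;
        }
    }

    /* The rule: after an exception (or normal completion), drain results. */
    while (mock_get_result(conn))
        ;

    return processed;
}
```

With a real connection the drain step would be a loop over PQgetResult() with PQclear() on each result, or simply PQfinish() if the connection is being abandoned.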

---------------

Again, note that if we provide a PQrecvRow()-style API, then we *don't*
need early-exit in the callback API. Nor exceptions...

If the current PQrecvRow() / PQgetRow() are too bloated for you, then how
about a thin wrapper around PQisBusy():

static int hasrow_cb(PGresult *res, PGrowValue *columns, void *param);

/* 0 - need more data, 1 - have result, 2 - have row */
int PQhasRowOrResult(PGconn *conn, PGresult **hdr, PGrowValue **cols)
{
    int gotrow = 0;
    PQrowProcessor oldproc;
    void *oldarg;
    int busy;

    /* call PQisBusy with our callback, then restore the old one */
    oldproc = PQgetRowProcessor(conn, &oldarg);
    PQsetRowProcessor(conn, hasrow_cb, &gotrow);
    busy = PQisBusy(conn);
    PQsetRowProcessor(conn, oldproc, oldarg);

    if (gotrow)
    {
        *hdr = conn->result;
        *cols = conn->rowBuf;
        return 2;
    }
    return busy ? 0 : 1;
}

static int hasrow_cb(PGresult *res, PGrowValue *columns, void *param)
{
    int *gotrow = param;

    *gotrow = 1;
    return 0;    /* request early-exit so the caller can look at the row */
}

Also, instead of hasrow_cb(), we could have an integer flag under PGconn
that getAnotherTuple() checks, thus getting rid of early-exit even in the
internal callback API.

Yes, it requires an async-style usage pattern, but it works for both
sync and async connections. And it can be used to build an even
simpler API on top of it.
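
To make the "simpler API on top" idea concrete, here is a minimal sketch that runs without libpq. FakeConn, fake_has_row_or_result(), fake_wait_and_consume(), fake_get_row() and count_rows() are all hypothetical names made up for this example. The fake connection only mimics the 0/1/2 return convention of the wrapper above and pretends that rows arrive in small network chunks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PGconn: rows arrive in "network chunks". */
typedef struct
{
    const char **rows;   /* rows the fake server will send */
    int nrows;
    int next;            /* index of the next undelivered row */
    int need_io;         /* 1 = must read from the socket first */
} FakeConn;

/* Same convention: 0 - need more data, 1 - have result, 2 - have row */
static int fake_has_row_or_result(FakeConn *conn, const char **row)
{
    if (conn->need_io)
        return 0;
    if (conn->next < conn->nrows)
    {
        *row = conn->rows[conn->next++];
        /* pretend every second row ends a network chunk */
        conn->need_io = (conn->next % 2 == 0);
        return 2;
    }
    return 1;
}

/* Stand-in for waiting on the socket and reading more input. */
static void fake_wait_and_consume(FakeConn *conn)
{
    conn->need_io = 0;
}

/* Simpler blocking API: next row, or NULL once the result is complete. */
static const char *fake_get_row(FakeConn *conn)
{
    const char *row = NULL;

    assert(conn != NULL);
    for (;;)
    {
        switch (fake_has_row_or_result(conn, &row))
        {
            case 2:
                return row;
            case 1:
                return NULL;
            default:
                fake_wait_and_consume(conn);
                break;
        }
    }
}

/* Example consumer: rows are handled in plain code, outside any callback. */
static int count_rows(FakeConn *conn)
{
    int n = 0;

    while (fake_get_row(conn) != NULL)
        n++;
    return n;
}
```

The caller never sees the early-exit; it just gets rows one at a time until NULL, in the manner of a PQgetRow()-style call.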

Summary: it would be better to keep "early-exit" as an internal detail,
as the actual feature needed is processing rows outside of the callback.
So why not provide an API for it?

--
marko

In response to

Re: Speed dblink using alternate libpq tuple storage at 2012-04-01 23:23:06 from Tom Lane

Responses

Re: Speed dblink using alternate libpq tuple storage at 2012-04-03 21:32:25 from Tom Lane
