Re: Speed dblink using alternate libpq tuple storage

From:	Marko Kreen <markokr(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject:	Re: Speed dblink using alternate libpq tuple storage
Date:	2012-04-01 23:03:27
Message-ID:	20120401230327.GA24637@gmail.com
Lists:	pgsql-hackers

On Sun, Apr 01, 2012 at 05:51:19PM -0400, Tom Lane wrote:
> I've been thinking some more about the early-termination cases (where
> the row processor returns zero or longjmps), and I believe I have got
> proposals that smooth off most of the rough edges there.
>
> First off, returning zero is really pretty unsafe as it stands, because
> it only works more-or-less-sanely if the connection is being used in
> async style. If the row processor returns zero within a regular
> PQgetResult call, that will cause PQgetResult to block waiting for more
> input. Which might not be forthcoming, if we're in the last bufferload
> of a query response from the server. Even in the async case, I think
> it's a bad design to have PQisBusy return true when the row processor
> requested stoppage. In that situation, there is work available for the
> outer application code to do, whereas normally PQisBusy == true means
> we're still waiting for the server.
>
> I think we can fix this by introducing a new PQresultStatus, called say
> PGRES_SUSPENDED, and having PQgetResult return an empty PGresult with
> status PGRES_SUSPENDED after the row processor has returned zero.
> Internally, there'd also be a new asyncStatus PGASYNC_SUSPENDED,
> which we'd set before exiting from the getAnotherTuple call. This would
> cause PQisBusy and PQgetResult to do the right things. In PQgetResult,
> we'd switch back to PGASYNC_BUSY state after producing a PGRES_SUSPENDED
> result, so that subsequent calls would resume parsing input.
>
> With this design, a suspending row processor can be used safely in
> either async or non-async mode. It does cost an extra PGresult creation
> and deletion per cycle, but that's not much more than a malloc and free.
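
For readers following along, the suspended-state idea quoted above can be modeled with a toy state machine. This is only a sketch under stated assumptions: every name in it (`mock_conn`, `mock_get_result`, `ASYNC_SUSPENDED`, `RES_SUSPENDED`, and so on) is a hypothetical stand-in, not a real libpq definition, and "blocking for input" is simulated by pretending the remaining rows arrive at once.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of these are real libpq names. */
typedef enum { ASYNC_IDLE, ASYNC_BUSY, ASYNC_READY, ASYNC_SUSPENDED } mock_async_status;
typedef enum { RES_NONE, RES_SUSPENDED, RES_TUPLES_OK } mock_result_status;

/* Returns nonzero to continue, zero to request suspension. */
typedef int (*mock_row_processor)(int row, void *param);

typedef struct {
    mock_async_status status;
    int next_row;      /* next row to hand to the row processor */
    int rows_arrived;  /* rows already sitting in the input buffer */
    int total_rows;    /* rows the pretend server will send in total */
    mock_row_processor proc;
    void *param;
} mock_conn;

/* Parse buffered rows until input runs out, the query ends, or the
 * row processor asks to stop. */
void mock_parse_input(mock_conn *c)
{
    assert(c->proc != NULL);
    while (c->status == ASYNC_BUSY) {
        if (c->next_row < c->rows_arrived) {
            /* Assumption: the row that triggers suspension counts as consumed. */
            if (c->proc(c->next_row++, c->param) == 0)
                c->status = ASYNC_SUSPENDED;
        } else if (c->next_row >= c->total_rows) {
            c->status = ASYNC_READY;   /* final result is available */
        } else {
            break;                     /* would have to wait for the server */
        }
    }
}

/* Busy only while waiting for the server; a suspension is not "busy". */
int mock_is_busy(mock_conn *c)
{
    mock_parse_input(c);
    return c->status == ASYNC_BUSY;
}

mock_result_status mock_get_result(mock_conn *c)
{
    for (;;) {
        mock_parse_input(c);
        switch (c->status) {
        case ASYNC_SUSPENDED:
            c->status = ASYNC_BUSY;    /* the next call resumes parsing */
            return RES_SUSPENDED;
        case ASYNC_READY:
            c->status = ASYNC_IDLE;
            return RES_TUPLES_OK;
        case ASYNC_IDLE:
            return RES_NONE;
        case ASYNC_BUSY:
            /* Stand-in for a blocking read: the rest of the rows arrive. */
            c->rows_arrived = c->total_rows;
            break;
        }
    }
}

/* Example row processor: requests suspension after every second row. */
int stop_every_two(int row, void *param)
{
    int *seen = param;
    (void) row;
    ++*seen;
    return (*seen % 2) != 0;
}

/* Drains one pretend query and reports how often it was suspended. */
int mock_count_suspensions(int total_rows, int *rows_seen)
{
    mock_conn c = { ASYNC_BUSY, 0, 0, total_rows, stop_every_two, rows_seen };
    int suspensions = 0;
    mock_result_status st;

    *rows_seen = 0;
    while ((st = mock_get_result(&c)) != RES_NONE) {
        if (st == RES_SUSPENDED)
            suspensions++;             /* outer code would do its work here */
    }
    return suspensions;
}
```

In this model a blocking-style loop over `mock_get_result()` never hangs after a suspension, because the suspended state is reported as a result of its own instead of being folded into "still busy".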

I added extra magic to PQisBusy(); you are adding extra magic to
PQgetResult(). Not much difference.

It seems we both lost sight of the actual usage scenario for the
early-exit logic: both the callback and the upper-level code *must*
cooperate for it to be useful. Instead, we designed the API for the
non-cooperating case, which is wrong.

So the proper approach would be to have a new API call, designed to
handle it, and to allow early exit only from there.

That would also avoid any breakage of old APIs. It would also avoid
accidental data loss if the user code does not have exactly the
right sequence of calls.

How about PQisBusy2(), which returns 2 when early exit is requested?
Please suggest something better...
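
To make the cooperating case concrete, here is a minimal sketch of how a tri-state call might be used. It is purely illustrative: PQisBusy2() is only a suggestion in this message, so the code uses a mock connection, and every name (`mock_conn2`, `mock_is_busy2`, `fill_batch`, `mock_drain`, the `MOCK_*` values) is hypothetical. Waiting on the socket is simulated by pretending the remaining rows arrive at once.

```c
#include <stddef.h>

/* Hypothetical tri-state return values for the suggested call. */
enum { MOCK_READY = 0, MOCK_WAITING = 1, MOCK_EARLY_EXIT = 2 };

/* Returns nonzero to continue, zero to request early exit. */
typedef int (*mock_row_cb)(int row, void *param);

typedef struct {
    int next_row;      /* next row to hand to the callback */
    int rows_arrived;  /* rows already buffered */
    int total_rows;    /* rows the pretend server will send in total */
    mock_row_cb cb;
    void *param;
} mock_conn2;

/* Early exit is reported only through this call, so only code written
 * against it can ever see a half-processed result. */
int mock_is_busy2(mock_conn2 *c)
{
    while (c->next_row < c->rows_arrived) {
        int row = c->next_row++;
        if (c->cb(row, c->param) == 0)
            return MOCK_EARLY_EXIT;
    }
    return c->next_row >= c->total_rows ? MOCK_READY : MOCK_WAITING;
}

#define BATCH_SIZE 3

typedef struct {
    int rows[BATCH_SIZE];
    int used;
} mock_batch;

/* Callback half of the contract: fill a fixed batch, stop when full. */
int fill_batch(int row, void *param)
{
    mock_batch *b = param;
    b->rows[b->used++] = row;
    return b->used < BATCH_SIZE;
}

/* Caller half of the contract: empty the batch on every early exit.
 * Skipping that step would overflow the batch, which is why both
 * sides have to cooperate. Returns the number of rows handled. */
int mock_drain(int total_rows, int *early_exits)
{
    mock_batch b = { {0}, 0 };
    mock_conn2 c = { 0, 0, total_rows, fill_batch, &b };
    int handled = 0;

    *early_exits = 0;
    for (;;) {
        int rc = mock_is_busy2(&c);
        if (rc == MOCK_EARLY_EXIT) {
            handled += b.used;         /* "flush" the full batch */
            b.used = 0;
            ++*early_exits;
        } else if (rc == MOCK_WAITING) {
            /* Stand-in for waiting on the socket and reading input. */
            c.rows_arrived = c.total_rows;
        } else {
            break;                     /* MOCK_READY: query is complete */
        }
    }
    handled += b.used;                 /* flush the final partial batch */
    return handled;
}
```

The point of the sketch is the pairing: `fill_batch()` may only return zero because `mock_drain()` is known to handle `MOCK_EARLY_EXIT`, and code using an older-style call would never be handed that value.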

--
marko

In response to

Re: Speed dblink using alternate libpq tuple storage at 2012-04-01 21:51:19 from Tom Lane

Responses

Re: Speed dblink using alternate libpq tuple storage at 2012-04-01 23:23:06 from Tom Lane