Re: Speed dblink using alternate libpq tuple storage

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Marko Kreen <markokr(at)gmail(dot)com>
Cc:	Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject:	Re: Speed dblink using alternate libpq tuple storage
Date:	2012-04-01 21:51:19
Message-ID:	28118.1333317079@sss.pgh.pa.us
Lists:	pgsql-hackers

I've been thinking some more about the early-termination cases (where
the row processor returns zero or longjmps), and I believe I have got
proposals that smooth off most of the rough edges there.

First off, returning zero is really pretty unsafe as it stands, because
it only works more-or-less-sanely if the connection is being used in
async style. If the row processor returns zero within a regular
PQgetResult call, that will cause PQgetResult to block waiting for more
input. Which might not be forthcoming, if we're in the last bufferload
of a query response from the server. Even in the async case, I think
it's a bad design to have PQisBusy return true when the row processor
requested stoppage. In that situation, there is work available for the
outer application code to do, whereas normally PQisBusy == true means
we're still waiting for the server.

I think we can fix this by introducing a new PQresultStatus, called say
PGRES_SUSPENDED, and having PQgetResult return an empty PGresult with
status PGRES_SUSPENDED after the row processor has returned zero.
Internally, there'd also be a new asyncStatus PGASYNC_SUSPENDED,
which we'd set before exiting from the getAnotherTuple call. This would
cause PQisBusy and PQgetResult to do the right things. In PQgetResult,
we'd switch back to PGASYNC_BUSY state after producing a PGRES_SUSPENDED
result, so that subsequent calls would resume parsing input.
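The state transitions proposed above can be sketched with a toy model. This is only an illustration under stated assumptions: every Toy* type and toy_* function below is hypothetical and stands in for the libpq internals discussed in this message; none of it is libpq's actual API, and rows are reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical stand-ins, not libpq's. */
typedef enum {
    TOY_ASYNC_IDLE,
    TOY_ASYNC_BUSY,
    TOY_ASYNC_SUSPENDED,   /* stands in for the proposed PGASYNC_SUSPENDED */
    TOY_ASYNC_READY
} ToyAsyncStatus;

typedef enum {
    TOY_RES_NONE,          /* models PQgetResult returning NULL */
    TOY_RES_WOULD_BLOCK,   /* models "would have to wait for the server" */
    TOY_RES_TUPLES_OK,
    TOY_RES_SUSPENDED      /* stands in for the proposed PGRES_SUSPENDED */
} ToyResultStatus;

/* Row processor: return 1 to continue, 0 to request suspension. */
typedef int (*ToyRowProcessor)(int row_number, void *param);

typedef struct {
    ToyAsyncStatus  async_status;
    int             rows_buffered;   /* rows received but not yet parsed */
    int             rows_delivered;  /* rows passed to the row processor */
    int             complete_seen;   /* end-of-query message is buffered */
    ToyRowProcessor row_processor;
    void           *param;
} ToyConn;

/* Parse buffered rows until input runs out or the processor suspends. */
static void toy_parse_input(ToyConn *conn)
{
    assert(conn != NULL);
    while (conn->async_status == TOY_ASYNC_BUSY && conn->rows_buffered > 0) {
        conn->rows_buffered--;
        conn->rows_delivered++;
        if (conn->row_processor(conn->rows_delivered, conn->param) == 0) {
            /* Set before leaving the row-parsing step, as proposed. */
            conn->async_status = TOY_ASYNC_SUSPENDED;
            return;
        }
    }
    if (conn->async_status == TOY_ASYNC_BUSY &&
        conn->rows_buffered == 0 && conn->complete_seen)
        conn->async_status = TOY_ASYNC_READY;
}

/* Busy only while still waiting for the server, never while suspended. */
static int toy_is_busy(ToyConn *conn)
{
    toy_parse_input(conn);
    return conn->async_status == TOY_ASYNC_BUSY;
}

static ToyResultStatus toy_get_result(ToyConn *conn)
{
    toy_parse_input(conn);
    switch (conn->async_status) {
    case TOY_ASYNC_SUSPENDED:
        /* Hand back a "suspended" result, then resume parsing next call. */
        conn->async_status = TOY_ASYNC_BUSY;
        return TOY_RES_SUSPENDED;
    case TOY_ASYNC_READY:
        conn->async_status = TOY_ASYNC_IDLE;
        return TOY_RES_TUPLES_OK;
    case TOY_ASYNC_BUSY:
        return TOY_RES_WOULD_BLOCK;
    default:
        return TOY_RES_NONE;
    }
}

/* Example processor: asks for suspension after every second row. */
static int suspend_every_two(int row_number, void *param)
{
    (void) param;
    return (row_number % 2) != 0;
}
```

In this model, a connection holding five buffered rows plus the end-of-query message yields two suspended results, then the final result, then "no more results", and toy_is_busy reports false while a suspension is pending.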

With this design, a suspending row processor can be used safely in
either async or non-async mode. It does cost an extra PGresult creation
and deletion per cycle, but that's not much more than a malloc and free.

Also, we can document that a longjmp out of the row processor leaves the
library in the same state as if the row processor had returned zero and
a PGRES_SUSPENDED result had been returned to the application; which
will be a true statement in all cases, sync or async.

I also mentioned earlier that I wasn't too thrilled with the design of
PQskipResult; in particular that it would encourage application writers
to miss server-sent error results, which would inevitably be a bad idea.
I think what we ought to do is define (and implement) it as being
exactly equivalent to PQgetResult, except that it temporarily installs
a dummy row processor so that data rows are discarded rather than
accumulated. Then, the documented way to clean up after deciding to
abandon a suspended query will be to do PQskipResult until it returns
null, paying normal attention to any result statuses other than
PGRES_TUPLES_OK. This is still not terribly helpful for async-mode
applications, but what they'd probably end up doing is installing their
own dummy row processors and then flushing results as part of their
normal outer loop. The only thing we could do for them is to expose
a dummy row processor, which seems barely worthwhile given that it's
a one-line function.
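The cleanup loop described above might look roughly like the following. Again this is a self-contained toy under stated assumptions: ToyConn, ToyResult, toy_skip_result and toy_discard_row are hypothetical stand-ins for the connection, PGresult, the proposed PQskipResult and the dummy row processor, and the "server" is just an array of pending results.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical stand-ins, not libpq's. */
typedef enum { TOY_TUPLES_OK, TOY_FATAL_ERROR } ToyStatus;

typedef struct {
    ToyStatus status;
    int       ntuples;      /* data rows that precede this result */
} ToyResult;

typedef struct {
    const ToyResult *pending;   /* results the server has yet to deliver */
    size_t           npending;
    size_t           next;
    int              rows_discarded;
} ToyConn;

/* The "one-line" dummy row processor: accept the row and drop it. */
static int toy_discard_row(void *row, void *param)
{
    (void) row;
    (void) param;
    return 1;
}

/*
 * Stand-in for the proposed PQskipResult: behaves like fetching the next
 * result, except that data rows go through the dummy row processor and
 * are discarded. Returns NULL once nothing is left.
 */
static const ToyResult *toy_skip_result(ToyConn *conn)
{
    assert(conn != NULL);
    if (conn->next >= conn->npending)
        return NULL;
    const ToyResult *res = &conn->pending[conn->next++];
    for (int i = 0; i < res->ntuples; i++)
        if (toy_discard_row(NULL, NULL))
            conn->rows_discarded++;
    return res;
}

/*
 * Abandon a suspended query: skip results until NULL, still looking at
 * every status so that server-sent errors are not missed. Returns the
 * number of results whose status was not TOY_TUPLES_OK.
 */
static int toy_abandon_query(ToyConn *conn)
{
    int errors = 0;
    const ToyResult *res;

    while ((res = toy_skip_result(conn)) != NULL)
        if (res->status != TOY_TUPLES_OK)
            errors++;
    return errors;
}
```

The point of the sketch is that discarding rows and ignoring results are separate things: the rows vanish, but each result status still passes through the caller's hands.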

I remain of the opinion that PQgetRow/PQrecvRow aren't adding much
usability-wise.

regards, tom lane

In response to

Re: Speed dblink using alternate libpq tuple storage at 2012-03-31 15:50:27 from Tom Lane

Responses

Re: Speed dblink using alternate libpq tuple storage at 2012-04-01 23:03:27 from Marko Kreen