BUG #17948: libpq seems to misbehave in a pipelining corner case

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: i(dot)trofimow(at)yandex(dot)ru
Subject: BUG #17948: libpq seems to misbehave in a pipelining corner case
Date: 2023-05-26 17:27:39
Message-ID: 17948-fcace7557e449957@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17948
Logged by: Ivan Trofimov
Email address: i(dot)trofimow(at)yandex(dot)ru
PostgreSQL version: 15.3
Operating system: Ubuntu 20.04
Description:

As far as I understand, libpq maintains an invariant: if a PQpipelineSync
call succeeds, then PQresultStatus will eventually return
PGRES_PIPELINE_SYNC, or the connection will end up in the CONNECTION_BAD
state.

This is highlighted in several places in the docs:
1. "PGRES_PIPELINE_SYNC is reported exactly once for each PQpipelineSync at
the corresponding point in the pipeline."
2. "From the client's perspective, after PQresultStatus returns
PGRES_FATAL_ERROR, the pipeline is flagged as aborted.
PQresultStatus will report a PGRES_PIPELINE_ABORTED result for each
remaining queued operation in an aborted pipeline.
The result for PQpipelineSync is reported as PGRES_PIPELINE_SYNC to signal
the end of the aborted pipeline and
resumption of normal result processing. The client must process results with
PQgetResult during error recovery."

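So I expect code like this

if (!PQpipelineSync(conn)) exit(1);

while (PQstatus(conn) != CONNECTION_BAD) {
    PGresult *res = PQgetResult(conn);
    /* PQresultStatus(NULL) yields PGRES_FATAL_ERROR, so a NULL result keeps looping */
    ExecStatusType status = PQresultStatus(res);
    PQclear(res); /* safe on NULL; avoids leaking each result */
    if (status == PGRES_PIPELINE_SYNC) {
        break;
    }
}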

to eventually exit the loop.

However, if instead of the expected "ReadyToQuery" response the server sends
an error and closes the connection (say, the backend is terminated by an
administrator, or a proxy is in place and the upstream PG server is dead),
this loop never exits and spins in a busy-loop.
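
One way a client could avoid spinning in this situation is to count consecutive NULL results while a Sync is still outstanding and give up after a threshold. The following is only a minimal sketch of that idea: every name in it (drain_obs, drain_action, drain_guard, drain_guard_step) is hypothetical and not part of libpq, and the assumption that several NULL results in a row mean no PGRES_PIPELINE_SYNC is coming is a heuristic, not documented libpq behavior.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; none of them are part of libpq. */

/* What one loop iteration observed after calling PQgetResult. */
typedef enum {
    OBS_SYNC,          /* PQresultStatus(res) == PGRES_PIPELINE_SYNC */
    OBS_OTHER_RESULT,  /* any other non-NULL result */
    OBS_NULL           /* PQgetResult returned NULL */
} drain_obs;

typedef enum { ACT_CONTINUE, ACT_DONE, ACT_GIVE_UP } drain_action;

typedef struct {
    int consecutive_nulls;  /* NULL results seen in a row so far */
    int max_nulls;          /* assumed threshold before giving up */
} drain_guard;

/* Decide what the draining loop should do after one observation. */
static drain_action drain_guard_step(drain_guard *g, drain_obs obs)
{
    assert(g != NULL && g->max_nulls > 0);

    if (obs == OBS_SYNC)
        return ACT_DONE;              /* the Sync result arrived */

    if (obs == OBS_OTHER_RESULT) {
        g->consecutive_nulls = 0;     /* progress was made, reset the counter */
        return ACT_CONTINUE;
    }

    /* OBS_NULL: tolerate a few, then stop instead of spinning forever. */
    g->consecutive_nulls++;
    return (g->consecutive_nulls >= g->max_nulls) ? ACT_GIVE_UP : ACT_CONTINUE;
}
```

In the loop above, the caller would translate each PQgetResult outcome into a drain_obs, break on ACT_DONE, and on ACT_GIVE_UP treat the connection as lost (for example by closing it with PQfinish) rather than keep polling.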

My understanding is that the transitions in the pipelining state machine go
like this:
1. The initial PQgetResult call gets here
https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#L2081
2. parseInput gets the error the server sent
https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-protocol3.c#L216
and switches the state to PGASYNC_READY.
3. PQgetResult continues and here
https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#L2126
advances the queue, so the Sync entry is gone, and later switches to
PGASYNC_PIPELINE_IDLE.
4. The next PQgetResult call gets here
https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#LL2110C4-L2110C26
and pqPipelineProcessQueue switches to PGASYNC_IDLE here
https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#L3084

Now we are stuck in a position where libpq considers the connection to be in
the CONNECTION_OK state (because no reads on the half-closed socket have been
issued), asyncStatus is PGASYNC_IDLE, the pipeline is in the
PQ_PIPELINE_ABORTED state, and the expected PGRES_PIPELINE_SYNC never came
and never will (because asyncStatus is PGASYNC_IDLE, so PQgetResult returns
NULL right away).
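
The transitions listed above can be replayed in a small toy model. This is a sketch of the sequence as described in this report, not libpq's actual code: toy_conn, toy_get_result, toy_drain and the enum names are made up for illustration, and the PGASYNC_READY step is folded into a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: names mirror the report, none of this is libpq code. */
typedef enum { TOY_BUSY, TOY_PIPELINE_IDLE, TOY_IDLE } toy_async;
typedef enum { RES_NONE, RES_FATAL_ERROR, RES_PIPELINE_SYNC } toy_result;

typedef struct {
    toy_async state;
    int queued_syncs;        /* Sync entries still in the command queue */
    int server_sends_error;  /* 1: server answers with an error and hangs up */
} toy_conn;

/* One simulated PQgetResult call; RES_NONE stands in for a NULL result. */
static toy_result toy_get_result(toy_conn *c)
{
    assert(c != NULL);
    switch (c->state) {
    case TOY_BUSY:
        assert(c->queued_syncs > 0);
        /* Steps 1-3: input is parsed and the queue is advanced past the
         * Sync entry. On error the state would pass through PGASYNC_READY
         * before reaching the pipeline-idle state. */
        c->queued_syncs--;
        c->state = TOY_PIPELINE_IDLE;
        return c->server_sends_error ? RES_FATAL_ERROR : RES_PIPELINE_SYNC;
    case TOY_PIPELINE_IDLE:
        /* Step 4: nothing is left in the queue, so fall back to plain idle. */
        c->state = (c->queued_syncs > 0) ? TOY_BUSY : TOY_IDLE;
        return RES_NONE;
    case TOY_IDLE:
        break;  /* nothing pending: "NULL" right away, on every call */
    }
    return RES_NONE;
}

/* Call toy_get_result up to max_calls times.
 * Returns the 1-based call number that produced the Sync result,
 * or -1 if it never showed up. */
static int toy_drain(toy_conn *c, int max_calls)
{
    for (int i = 1; i <= max_calls; i++) {
        if (toy_get_result(c) == RES_PIPELINE_SYNC)
            return i;
    }
    return -1;
}
```

Starting from { TOY_BUSY, 1, 1 }, the model yields one error result, then settles in TOY_IDLE and never produces the Sync result, which matches the stuck state described above; with server_sends_error set to 0 the Sync result arrives on the first call.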

I am able to reproduce this behavior reliably via
https://pastebin.com/raw/4j3v3QzC,
linking against libpq5=15.3 and running the program against PostgreSQL
15.2.

Who is at fault here: is it libpq, or am I misunderstanding or misusing libpq?
