libpq seems to misbehave in a pipelining corner case

From: Ivan Trofimov <itrofimow(at)userver(dot)tech>
To: "pgsql-interfaces(at)lists(dot)postgresql(dot)org" <pgsql-interfaces(at)lists(dot)postgresql(dot)org>
Subject: libpq seems to misbehave in a pipelining corner case
Date: 2023-05-24 12:05:41
Message-ID: 31441684929791@mail.yandex-team.ru
Lists: pgsql-interfaces



As far as I understand, libpq maintains an invariant: if a PQpipelineSync call succeeds, then either PQgetResult will eventually return a result whose PQresultStatus is PGRES_PIPELINE_SYNC, or the connection will end up in the CONNECTION_BAD state.
 

This is highlighted in several places in the docs:

1. "PGRES_PIPELINE_SYNC is reported exactly once for each PQpipelineSync at the corresponding point in the pipeline."

2. "From the client's perspective, after PQresultStatus returns PGRES_FATAL_ERROR, the pipeline is flagged as aborted. PQresultStatus will report a PGRES_PIPELINE_ABORTED result for each remaining queued operation in an aborted pipeline. The result for PQpipelineSync is reported as PGRES_PIPELINE_SYNC to signal the end of the aborted pipeline and resumption of normal result processing. The client must process results with PQgetResult during error recovery."
 

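So I expect code like this

    if (!PQpipelineSync(conn)) exit(1);

    while (PQstatus(conn) != CONNECTION_BAD) {
        PGresult *res = PQgetResult(conn);
        /* PQresultStatus(NULL) yields PGRES_FATAL_ERROR, so a NULL result is safe here */
        ExecStatusType status = PQresultStatus(res);
        PQclear(res);
        if (status == PGRES_PIPELINE_SYNC) {
            break;
        }
    }

to eventually exit the loop.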
 

However, if instead of the expected "ReadyToQuery" response the server sends an error and closes the connection (say, the backend is terminated by an administrator, or a proxy is in place and the upstream PG server is dead), this loop gets stuck spinning in a busy loop.
 

My understanding is that the transitions in the pipelining state machine go like this:

1. The initial PQgetResult call gets here https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#L2081

2. parseInput gets the error the server sent https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-protocol3.c#L216 and switches the state to PGASYNC_READY.

3. PQgetResult continues and here https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#L2126 advances the queue, so the Sync entry is gone, and later switches to PGASYNC_PIPELINE_IDLE.

4. The next PQgetResult call gets here https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#LL2110C4-L2110C26, and pqPipelineProcessQueue switches to PGASYNC_IDLE here https://github.com/postgres/postgres/blob/a817edbf6f302c376f5c0012d19a0474b6bdea88/src/interfaces/libpq/fe-exec.c#L3084
 

Now we are stuck in a position where libpq considers the connection to be in the CONNECTION_OK state (because no reads or writes over the half-closed socket have been issued), asyncStatus is PGASYNC_IDLE, the pipeline is in the PQ_PIPELINE_ABORTED state, and the expected PGRES_PIPELINE_SYNC never came and never will (because of PGASYNC_IDLE).
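
To make the sequence above easier to follow, here is a toy model of it in plain C. It is only a sketch of the transitions as described in this message, not libpq code: every Toy*/toy_* name is made up for illustration, and it covers only the case where the server answers the Sync with an error and then closes the connection.

```c
#include <stdbool.h>

/* Toy model of the scenario in this message. The names echo libpq's
 * PGASYNC_* states, but none of this is libpq code. */
typedef enum { TOY_BUSY, TOY_READY, TOY_PIPELINE_IDLE, TOY_IDLE } ToyAsync;
typedef enum { TOY_RES_NULL, TOY_RES_FATAL_ERROR, TOY_RES_PIPELINE_SYNC } ToyResult;

typedef struct {
    ToyAsync async_status;
    bool     pipeline_aborted;
    bool     connection_bad;   /* never set here: nothing touches the socket again */
    int      queue_len;        /* queued commands; starts at 1 for the Sync entry */
} ToyConn;

/* State right after a successful PQpipelineSync-like call. */
static ToyConn toy_conn_after_sync(void)
{
    ToyConn c = { TOY_BUSY, false, false, 1 };
    return c;
}

/* One PQgetResult-like step for the case where the server answers the
 * Sync with an error and then closes the connection. */
static ToyResult toy_get_result(ToyConn *c)
{
    switch (c->async_status) {
    case TOY_BUSY:
        /* Steps 1-2: the error is parsed and the state becomes READY. */
        c->async_status = TOY_READY;
        /* fall through */
    case TOY_READY:
        /* Step 3: the error is handed out, the queue advances past the
         * Sync entry, and the state becomes PIPELINE_IDLE. */
        c->pipeline_aborted = true;
        c->queue_len--;
        c->async_status = TOY_PIPELINE_IDLE;
        return TOY_RES_FATAL_ERROR;
    case TOY_PIPELINE_IDLE:
        /* Step 4: the queue is empty, so the state drops to IDLE. */
        c->async_status = TOY_IDLE;
        return TOY_RES_NULL;
    case TOY_IDLE:
        break;
    }
    return TOY_RES_NULL;   /* IDLE: no result, and no state change */
}

/* The loop from this message with an iteration cap. Returns the number of
 * iterations used; a value equal to max_iters means Sync never showed up. */
static int toy_wait_for_sync(ToyConn *c, int max_iters)
{
    int i;
    for (i = 0; i < max_iters && !c->connection_bad; i++) {
        if (toy_get_result(c) == TOY_RES_PIPELINE_SYNC)
            break;
    }
    return i;
}
```

Calling toy_wait_for_sync on a fresh toy_conn_after_sync() with a cap of 1000 uses up all 1000 iterations. The model ends in TOY_IDLE with the pipeline aborted, an empty queue, and connection_bad still false, which is the stuck state described above.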
 

Who is at fault here: postgres, libpq, or me misunderstanding/misusing libpq?

Attachment Content-Type Size
libpq_pipelining_busyloop.c text/x-c 1.8 KB
