Re: Add Pipelining support in psql

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
Cc: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add Pipelining support in psql
Date: 2025-02-18 07:22:53
Message-ID: Z7Q1TTtOxYEiNn0h@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 14, 2025 at 09:49:02AM +0100, Anthonin Bonnefoy wrote:
> During my tests, I've noticed I didn't handle query cancellation, this
> is now fixed. I've also added additional comments related to
> available_results to make it clearer that it depends on what the
> server has flushed to the client.

This is a pretty cool patch. I like the structure you have used for
the integration with the tracking of the number of commands, the
number of syncs (like pgbench) put in a pipeline, the number of
results requested and the number of results available. That makes the
whole easier to look at with a state in pset.

+ PSQL_SEND_PIPELINE_SYNC,
+ PSQL_START_PIPELINE_MODE,
+ PSQL_END_PIPELINE_MODE,
+ PSQL_FLUSH,
+ PSQL_SEND_FLUSH_REQUEST,
+ PSQL_GET_RESULTS,

These new values are inconsistent, let's use some more PSQL_SEND_*
here. That makes the whole set of values more greppable.

The tests in psql.sql are becoming really long. Perhaps it would be
better to split that into its own file, say psql_pipeline.sql? The
input file is already 2k lines, you are adding 15% more lines to that.

+ * otherwise, calling PQgetResults will block.

Likely PQgetResults => PQgetResult().

Wondering if the cancellation in ExecQueryAndProcessResults() is
sound, I've not been able to break it, still..

I can also get behind the additions of \flush and \flushrequest to
query different parts of libpq.

+ if (pset.requested_results == 0)
+ /* We've read all requested results, exit */
+ return res;

Set of nits with the style of the code, but I'd suggest to use
braces {} here, to outline that the comment is in a block. There's a
second, larger one in discardAbortedPipelineResults().

+ if (strcmp(cmd, "gx") == 0 && PQpipelineStatus(pset.db) != PQ_PIPELINE_OFF)
+ {
+ pg_log_error("\\gx not allowed in pipeline mode");
+ clean_extended_state();
+ return PSQL_CMD_ERROR;
+ }

What is the reasoning here behind this restriction? \gx is a wrapper
of \g with expanded mode on, but it is also possible to call \g with
expanded=on, bypassing this restriction.

Let's split the prompt patch with the support of %P into its own
patch.

-#define DEFAULT_PROMPT1 "%/%R%x%# "
-#define DEFAULT_PROMPT2 "%/%R%x%# "
+#define DEFAULT_PROMPT1 "%/%R%P%x%# "
+#define DEFAULT_PROMPT2 "%/%R%P%x%# "
#define DEFAULT_PROMPT3 ">> "

I don't think that changing this default is a good idea. Everybody
can do that in their own .psqlrc (spoiler: I do).

The idea to use three fields with a hardcoded format does not look
like a good idea to me. I think that this should be done in a
different and more extensible way:
- Use %P to track if we are in pipeline mode on, off or abort.
- Define three new variables that behave like ROW_COUNT, but for the
fields you want to track here. These could then be passed down to a
PROMPT with %:name:, grammar already supported.

That would make the whole much more flexible. At it seems to me that
we could also add requested_results to this set? These could be named
with the same prefix, like PIPELINE_SYNC_COUNT, etc.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alena Rybakina 2025-02-18 07:24:02 Re: Removing unneeded self joins
Previous Message Nazir Bilal Yavuz 2025-02-18 07:20:12 Re: read stream on amcheck