Re: Consider pipeline implicit transaction as a transaction block

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
Cc: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Consider pipeline implicit transaction as a transaction block
Date: 2024-11-25 06:39:39
Message-ID: Z0Qbq5q9cbSNg7nc@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 20, 2024 at 06:03:12PM +0100, Anthonin Bonnefoy wrote:
> 0001: This is a small bug I've stumbled upon. The query buffer is not
> cleared on a backslash error. For example, "SELECT 1 \parse" would
> fail due to a missing statement name but would leave "SELECT 1\n" in
> the query buffer which would impact the next command.

This breaks an existing property of psql. One example: \edit where we
should keep the existing query buffer rather than discard it if a
failure happens. This patch forcibly removes the contents of the
query buffer, and a failure could happen because of an incorrect
option given to this meta-command.

> 0002: This adds pipeline support to psql by adding the same
> meta-commands as pgbench: \startpipeline, \endpipeline and
> \syncpipeline. Once a pipeline is started, all extended protocol
> queries are piped until \endpipeline is called. To keep things simple,
> meta-commands like \gx, \gdesc and \gexec are forbidden inside a
> pipeline. The tests are covering the existing pipelining behaviour
> with the SET LOCAL, SAVEPOINTS, REINDEX CONCURRENTLY and LOCK
> commands, depending if it's the first command or not. The
> \syncpipeline allows us to see that this "first command" behaviour
> also happens after a synchronisation point within a pipeline.

That's pretty cool. I've wanted that myself.

This relies on 0001 for the case where \gx fails and you want to make
sure that the follow-up command is able to work in the same pipeline.
I'd suggest to add a \reset after the \gx in this test sequence,
without 0001. At this point we have IMO more advantages in
maintaining the existing properties rather than enforce it, especially
as we have \reset.

+ /*
+ * In pipeline mode, a NULL result is returned to notify the
+ * next query is being processed. We need to consume it and
+ * get the next result.
+ */
+ result = PQgetResult(pset.db);
+ Assert(result == NULL);
+ result = PQgetResult(pset.db);
[...]
next_result = PQgetResult(pset.db);
+ if (process_pipeline && result_status != PGRES_PIPELINE_SYNC)
+ {
+ /*
+ * In pipeline mode, a NULL result indicates the end of the
+ * current query being processed. We need to call PQgetResult a
+ * second time to move to the next response.
+ */
+ Assert(next_result == NULL);
+ next_result = PQgetResult(pset.db);
+ }

I see. This business in ExecQueryAndProcessResults() is consistent
with what pgbench does in discardUntilSync() because of the outer
loop. I found a bit confusing that you had to do this step twice.
Perhaps this could use better comments, but I am not sure what to use
here..

> 0003: The initial change to use implicit transaction block when
> pipelining is detected. The tests reflect the change in behaviour like
> the LOCK command working if it's the second command in the pipeline.
> I've updated the pipeline documentation to provide more details about
> the use of implicit transaction blocks with pipelines.

+-- \startpipeline, \endpipeline, \syncpipeline, \bind and \g are psql specific meta-commands
+\startpipeline

Why is stuff specific to psql being mentioned on the libpq page? That
does not seem adapted to me here. Documenting the expectations and
the assumptions one would rely on in the context of a pipeline and
what the backend thinks of it as an internal transaction is a good
idea.

It is sad to not have at least 2~3 tests that we could use for the
back-branches for the ERROR cases and not the first command cases with
minimal sequences in pgbench? I'm OK with your proposal with psql on
HEAD, but backpatching without a few tests is going to make the
maintenance of stable branches more complicated in the long run as we
would navigate blindly. So I'd try to fix the problems first, then
work on any new proposal once we are in a clean operational state.
And I'd recommend to propose the new feature on a new, separate,
thread to attract a better audience for the reviews this would
require.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-11-25 07:08:41 Re: POC, WIP: OR-clause support for indexes
Previous Message Richard Guo 2024-11-25 06:27:57 Re: POC, WIP: OR-clause support for indexes