Enabling parallel execution for cursors

From: Meftun Cincioğlu <meftun(dot)cincioglu(at)localus(dot)com(dot)tr>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Enabling parallel execution for cursors
Date: 2024-08-12 08:55:17
Message-ID: CAAeyror5bd6J6VFj4HR5zcWf_SLyfVNe53RtG4-A0jT+gZxdSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I believe the topic on the link below
https://www.postgresql.org/message-id/flat/98760523-a0d2-8705-38e3-c4602ecf2448%402ndquadrant.com
should be re-discussed, as more than 6 years passed from the last message.
Let me summarize the topic first and ask the real question: Can declared
cursors be developed that support parallelism?

[1] Tomas Vondra stated that cursors generally do not benefit from parallel
query. There are two main reasons why parallelism is disabled for
user-defined cursors (or queries that might get suspended):
(a) We can't predict what will happen while the query is suspended (and the
transaction is still in "parallel mode"), e.g. the user might run arbitrary
DML which is not allowed.
(b) If the cursor gets suspended, the parallel workers would be still
assigned to it and could not be used for anything else.
He proposed to add a new cursor option (PARALLEL), which would allow
parallel plans for that particular user-defined cursor. He also attached
the first version of the patch and indicated that the patch is experimental.

[2] Robert Haas stated that the proposed patch is a small portion of what
should be done to cover all the holes.

[3] Tomas Vondra stated that it should be checked at planning time if the
transaction is in parallel mode and the query contains unsafe/restricted
functions.

[4] Robert Haas said;
The main points I want to make clearly understood is the current design
relies on (1) functions being labeled correctly and (2) other dangerous
code paths being unreachable because there's nothing that runs between
EnterParallelMode and ExitParallelMode which could invoke them, except by
calling a mislabeled function. Your patch expands the vulnerability
surface from "executor code that can be reached without calling a
mislabeled function" to "any code that can be reached by typing an SQL
command". Just rejecting any queries that are parallel-unsafe probably
closes a good chunk of the holes, but that still leaves a lot of code
that's never been run in parallel mode before potentially now running in
parallel mode. Error handling might be a problem, too: what happens if a
parallel worker is killed while the query is suspended?
He also proposed to write a test patch that keeps parallel mode active at
all times except when running a query that contains something
parallel-unsafe.

[5] Tomas Vondra proposed a simpler option - what if we only allow fetches
from the PARALLEL cursor while the cursor is open?
BEGIN;
...
DECLARE x PARALLEL CURSOR FOR SELECT * FROM t2 WHERE ...;
FETCH 1000 FROM x;
FETCH 1000 FROM x;
FETCH 1000 FROM x;
CLOSE x;
...
COMMIT;
but adding any other command between the OPEN/CLOSE commands would fail.
That should close all the holes with parallel-unsafe stuff, right?
Of course, this won't solve the issue with error handling / killing
suspended workers (which didn't occur to me before as a possible issue at
all, so that's for pointing that out). But that's a significantly more
limited issue to fix than all the parallel-unsafe bits.

[6] Robert Haas asked some questions about the suggestion that Tomas stated
and also made a comment about the materialization issue that Craig stated.
He also said; If you're running a query like "SELECT * FROM bigtable", you
don't need parallel query in the first place, because a single backend is
quite capable of sending back the rows as fast as a client can read them.
If you're running a query like "SELECT * FROM bigtable WHERE <highly
selective predicate>" then that's a good use case for parallel query, but
then materializing it isn't that bad because the result set is a lot
smaller than the original table.

[7] Simon Riggs said;
At present, one major use of Cursors is in postgres_fdw. In that usage, the
query executes and returns all the rows. No other side execution is
possible. How do we make parallel query work for Cursors, if not by Tomas'
proposal? If a parallel cursor is requested, we could simply prevent other
intermediate commands other than FETCH (next).

[8] Robert Haas proposed providing some kind of infrastructure for workers
to detach from a parallel query. Let's say the leader is either (a)
suspending execution of the query, because it's a cursor, or (b) not able
to absorb rows as fast as workers are generating them.
In the former situation, we'd like to get rid of all workers; in the latter
situation, some workers. In the former situation, getting all workers to
shut down cleanly would let us exit parallel mode (and perhaps re-enter it
later when we resume execution of the query). In the latter situation, we
could avoid wasting workers on queries where the leader can't keep up so
that those worker slots are available to other queries that can benefit
from them.
He stated that there were some problems finding a point at which tuples
could be safely stopped from being returned.
He considered two approaches for handling parallel workers when a query is
suspended:
(a) Keep Workers Running: This would involve keeping workers active during
suspension, but it requires complex state synchronization and could tie up
resources for long periods.
(b) Restrict Backend Activity: Imposing strict limits on actions while a
parallel cursor is open (e.g., only allowing FETCH) could prevent some
issues, but it's highly restrictive and doesn't address all problems, such
as potential conflicts in transaction state and lock management. Both
approaches are challenging but potentially feasible with enough effort.

[9] Robert Haas stated if you restrict operations to only fetching from the
cursor and ensure that both protocol messages and SQL commands are locked
down, with errors automatically killing the workers, this approach might
work. However, challenges remain, such as handling errors when the leader
is idle, since reporting these errors is tricky due to protocol
limitations. Additionally, if fetching from the cursor is the only action
allowed, using `PQsetSingleRowMode` instead of a cursor might be a simpler
solution.

[10] Robert Haas stated parallel mode is designed to handle errors during
query execution only. Extending parallel mode beyond a query's context
introduces significant risks, including crashes or incorrect results.
Restricting post-query operations to a very limited set might work, but
allowing arbitrary PL/pgSQL code execution is too broad and risky.

This is the summary of the topic. Let me ask the question again: Can
declared cursors be developed that support parallelism?

[1]
https://www.postgresql.org/message-id/98760523-a0d2-8705-38e3-c4602ecf2448%402ndquadrant.com
[2]
https://www.postgresql.org/message-id/CA%2BTgmoaJVMbQbHn3i_Uzz_vSGsbDsavYkPV4Tqz%3DUbg%2Bv8%2BLUQ%40mail.gmail.com
[3]
https://www.postgresql.org/message-id/dfe4adec-ad9b-723c-d501-1eedc96837a3%402ndquadrant.com
[4]
https://www.postgresql.org/message-id/CA%2BTgmobcUxTPLbjFL4PQa_07RC5wdEr-N_Zi%2BEOGuJ8Ui1Vy8Q%40mail.gmail.com
[5]
https://www.postgresql.org/message-id/8bdb6684-09d7-f799-0a6a-362cdc251b31%402ndquadrant.com
[6]
https://www.postgresql.org/message-id/CA%2BTgmoYWoqDUViwPTk-rOJbGG8aqEVAQWBgauVbYaUxznSA%3D%3Dg%40mail.gmail.com
[7]
https://www.postgresql.org/message-id/CANP8%2BjJoAitOY5uM6L8xYjPHK-RjJtmsZ5TZKGrH8m54U9sfzA%40mail.gmail.com
[7-more]
https://www.postgresql.org/message-id/CANP8%2BjJBHDu%2BkgH2Jy34zzx9C5x61LF7JkWODPaRKuXMhD0vKw%40mail.gmail.com
[8]
https://www.postgresql.org/message-id/CA%2BTgmobXWEqRUr0k1RS%3Dx4NaAbi%2B9R-0z%2Bpp%2BVEuLYKorjbbYg%40mail.gmail.com
[9]
https://www.postgresql.org/message-id/CA%2BTgmoaBYa_WDSFhZA6sPYVnBWH_vwqpXcy3iA93oqaecj2p7A%40mail.gmail.com
[10]
https://www.postgresql.org/message-id/CA%2BTgmoZfwTc8DcoNQ_V2%3DUbmY%3DsVuPg%2B%2BMF6eNaWDVjaBczwrQ%40mail.gmail.com

Regards,

--
Meftun Cincioglu

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2024-08-12 09:10:58 Re: A problem about partitionwise join
Previous Message John Naylor 2024-08-12 08:34:26 Re: Linux likely() unlikely() for PostgreSQL