Re: Bypassing cursors in postgres_fdw to enable parallel plans

From: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Bypassing cursors in postgres_fdw to enable parallel plans
Date: 2025-01-17 12:03:09
Message-ID: CA+FpmFerinRL9LRY9B2FQzQc3Ba1qtj=vH1e2tVR6vBj1TaSYA@mail.gmail.com
Lists: pgsql-hackers

On Tue, 14 Jan 2025 at 18:33, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Jan 6, 2025 at 3:52 AM Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
> wrote:
> > Now, to overcome this limitation, I have worked on this idea (suggested
> > by my colleague Bernd Helmle) of bypassing the cursors. It works as
> > follows: there is a new GUC, postgres_fdw.use_cursor, which selects the
> > non-cursor mode when it is unset. In that mode, create_cursor calls
> > PQsetChunkedRowsMode, with the chunk size equal to fetch_size. Then, in
> > fetch_more_data, pgfdw_get_next_result is used to obtain each chunk as a
> > PGresult, which is processed in the same manner as before.
> >
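For context, the non-cursor flow described above boils down to roughly the
following libpq pattern (an untested, simplified sketch, not the patch code
itself; it only assumes libpq from PostgreSQL 17 or later, which provides
PQsetChunkedRowsMode):

#include <stdio.h>
#include <libpq-fe.h>

/* Fetch the result of "sql" in chunks of fetch_size rows, instead of
 * declaring a cursor and issuing FETCH commands against it. */
static void
fetch_in_chunks(PGconn *conn, const char *sql, int fetch_size)
{
    PGresult   *res;

    if (!PQsendQuery(conn, sql))
    {
        fprintf(stderr, "send failed: %s", PQerrorMessage(conn));
        return;
    }

    /* Must be set before any results of this query are consumed. */
    if (!PQsetChunkedRowsMode(conn, fetch_size))
    {
        fprintf(stderr, "could not enable chunked rows mode\n");
        return;
    }

    /*
     * Each PGRES_TUPLES_CHUNK result carries up to fetch_size rows; a
     * final PGRES_TUPLES_OK result with zero rows ends the query.
     */
    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_TUPLES_CHUNK)
            printf("got a chunk of %d rows\n", PQntuples(res));
        PQclear(res);
    }
}
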
> > Now, the issue arises when there are simultaneous queries, which is the
> > case with join queries where the join is performed at the local server.
> > In that case we have multiple cursors open at the same time, and without
> > the cursor mechanism we have no information or other structure telling
> > us what to fetch for which query. To handle that case, there is a flag,
> > only_query, which is unset in postgresBeginForeignScan as soon as a
> > cursor_number >= 2 is assigned. Then, in fetch_more_data, when we find
> > that only_query is unset, we fetch all the data for the query and store
> > it in a Tuplestore. These tuples are then transferred to fsstate->tuples
> > and processed as usual.
> >
> > So yes, there is a performance drawback in the case of simultaneous
> > queries; however, the ability to use parallel plans is a real added
> > advantage for users. Plus, the new GUC -- use_cursor -- lets us keep
> > things as before in case some workloads lose more than they gain. So,
> > in short, I feel hopeful that this could be a good idea and a good time
> > to improve postgres_fdw.
>
> Hi,
>
> I think it might have been nice to credit me in this post, since I
> made some relevant suggestions here off-list, in particular the idea
> of using a Tuplestore when there are multiple queries running. But I
> don't think this patch quite implements what I suggested. Here, you
> have a flag only_query which gets cleared at some point in time and
> thereafter remains cleared for the lifetime of a session. That means, I
> think, that all future queries will use the tuplestore even though there
> might not be multiple queries running any more, which doesn't seem like
> what we want. And, actually, it looks like the flag will be cleared as
> soon as you reach the second query in the same transaction, even if the
> two queries don't overlap. I think what you want to do is test, at the
> point where we would need to issue a new query, whether an existing
> query is already running. If so, move that query's remaining results
> into a Tuplestore so you can issue the new query.
>
> I'm not sure what the best way to implement that is, exactly. Perhaps
> fsstate->conn_state needs to store some more details about the
> connection, but that's just a guess. I don't think a global variable
> is what you want. Not only is that session-lifetime, but it applies
> globally to every connection to every server. You want to test
> something that is specific to one connection to one server, so it
> needs to be part of a data structure that is scoped that way.
>
> I think you'll want to figure out a good way to test this patch. I'm not
> yet sure whether this new functionality needs automated test cases, or
> whether they can reasonably be written, but you at least want a good
> procedure for manual validation, so that you can verify that the
> Tuplestore is used in all the cases where it needs to be and, hopefully,
> no others.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>

Indeed, you are right.
Firstly, please accept my apologies for not crediting you for this work.
Thanks a lot for your efforts; the discussions with you were very helpful
in shaping this patch and bringing it to this level.

Next, yes, the last version used the tuplestore for queries within the same
transaction after the second query. To overcome this, I now identify
whether any other query is running simultaneously with the current one:
there is an int variable, num_queries, which is incremented on every call
of postgresBeginForeignScan and decremented on every call of
postgresEndForeignScan. If simultaneous queries are running, num_queries is
greater than 1, and the tuplestore is used only in that case. In other
words, the assumption is that if postgresBeginForeignScan is called again
before the matching postgresEndForeignScan, then the queries are
simultaneous.
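
In rough outline, the approach looks like this (a simplified sketch of the
idea with illustrative names; the actual code in postgres_fdw.c differs in
detail):

/* Count of foreign scans currently open in this backend. */
static int	num_queries = 0;

static void
postgresBeginForeignScan(ForeignScanState *node, int eflags)
{
	/* ... existing setup: get connection, build remote query, etc. ... */

	num_queries++;			/* one more scan is now in flight */
}

static void
postgresEndForeignScan(ForeignScanState *node)
{
	/* ... existing teardown: finish the remote query, release resources ... */

	num_queries--;			/* this scan has finished */
}

static void
fetch_more_data(ForeignScanState *node)
{
	if (num_queries > 1)
	{
		/*
		 * Another foreign scan is active at the same time, so chunked
		 * fetches cannot be interleaved on the connection: pull all
		 * remaining rows for this query into a tuplestore and read
		 * from that instead.
		 */
	}
	else
	{
		/* Only this scan is active: fetch the next chunk directly. */
	}
}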

I couldn't really find an automated method of testing this, but I verified
it manually by debugging and/or adding log statements in
postgresBeginForeignScan, postgresEndForeignScan, and fetch_more_data to
confirm that simultaneous queries do occur and that only they use the
tuplestore. The case of simultaneous queries I found was the join query:
the executor creates the cursor for one side of the join and retrieves its
first tuples, then creates the cursor for the other side of the join and
reads all the tuples for that query, and only then comes back to the first
cursor and retrieves all of its tuples. The same holds for queries joining
n tables: there are n open cursors at a time, which are then closed one by
one in descending order of cursor_number. I will think more about testing
this and try to come up with a script (in the best case), or at least a
documented set of steps, to confirm that the tuplestore is used only in the
required cases.
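
For reference, the manual check amounts to temporary log lines of roughly
this shape (illustrative only, not part of the patch), dropped into the
three functions mentioned above:

/* Temporary instrumentation for manual verification; not part of the patch. */
elog(DEBUG1, "postgres_fdw: BeginForeignScan, num_queries is now %d", num_queries);
elog(DEBUG1, "postgres_fdw: EndForeignScan, num_queries is now %d", num_queries);
elog(DEBUG1, "postgres_fdw: fetch_more_data via %s (num_queries = %d)",
	 num_queries > 1 ? "tuplestore" : "chunked fetch", num_queries);

With log_min_messages set to debug1, running a locally evaluated join of
two foreign tables then shows the overlapping Begin/End calls and which
fetch path each scan took.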

For the regular testing of this feature, I think a regression run with the
new GUC postgres_fdw.use_cursor set to false, executing all the existing
postgres_fdw tests, should suffice. What do you think? However, at the
moment the regression tests fail in non-cursor mode. Some queries only need
an ORDER BY, because row ordering changes in non-cursor mode, but others
require more complex changes; I am working on them.

In this version of the patch I have added only the changes described above,
not the regression test modifications.

--
Regards,
Rafia Sabih
CYBERTEC PostgreSQL International GmbH

Attachment Content-Type Size
v2-0001-Add-a-fetch-mechanism-without-cursors.patch application/octet-stream 14.7 KB
