Asynchronous MergeAppend

From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Asynchronous MergeAppend
Date: 2024-07-17 13:24:28
Message-ID: 59be194c5a409fb9fc9f2031581b8a44@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

I'd like to make MergeAppend node Async-capable like Append node.
Nowadays when planner chooses MergeAppend plan, asynchronous execution
is not possible. With attached patches you can see plans like

EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM async_pt WHERE b % 100 = 0 ORDER BY b, a;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Merge Append
Sort Key: async_pt.b, async_pt.a
-> Async Foreign Scan on public.async_p1 async_pt_1
Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE (((b %
100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
-> Async Foreign Scan on public.async_p2 async_pt_2
Output: async_pt_2.a, async_pt_2.b, async_pt_2.c
Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE (((b %
100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST

This can be quite profitable (in our test cases you can gain up to two
times better speed with MergeAppend async execution on remote servers).

Code for asynchronous execution in Merge Append was mostly borrowed from
Append node.

What significantly differs - in ExecMergeAppendAsyncGetNext() you must
return tuple from the specified slot.
Subplan number determines tuple slot where data should be retrieved to.
When subplan is ready to provide some data,
it's cached in ms_asyncresults. When we get tuple for subplan, specified
in ExecMergeAppendAsyncGetNext(),
ExecMergeAppendAsyncRequest() returns true and loop in
ExecMergeAppendAsyncGetNext() ends. We can fetch data for
subplans which either don't have cached result ready or have already
returned them to the upper node. This
flag is stored in ms_has_asyncresults. As we can get data for some
subplan either earlier or after loop in ExecMergeAppendAsyncRequest(),
we check this flag twice in this function.
Unlike ExecAppendAsyncEventWait(), it seems
ExecMergeAppendAsyncEventWait() doesn't need a timeout - as there's no
need to get result
from synchronous subplan if a tuple form async one was explicitly
requested.

Also we had to fix postgres_fdw to avoid directly looking at Append
fields. Perhaps, accesors to Append fields look strange, but allows
to avoid some code duplication. I suppose, duplication could be even
less if we reworked async Append implementation, but so far I haven't
tried to do this to avoid big diff from master.

Also mark_async_capable() believes that path corresponds to plan. This
can be not true when create_[merge_]append_plan() inserts sort node.
In this case mark_async_capable() can treat Sort plan node as some other
and crash, so there's a small fix for this.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment Content-Type Size
v1-0001-mark_async_capable-subpath-should-match-subplan.patch text/x-diff 1.9 KB
v1-0002-MergeAppend-should-support-Async-Foreign-Scan-subpla.patch text/x-diff 34.7 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Simpson 2024-07-17 13:24:43 Re: filesystem full during vacuum - space recovery issues
Previous Message torikoshia 2024-07-17 13:21:07 Re: Add new COPY option REJECT_LIMIT