Re: Asynchronous MergeAppend

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Asynchronous MergeAppend
Date: 2024-08-10 20:24:43
Message-ID: 764dd8b8-6374-4f5a-aac7-d8e3f6ebe5fd@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi! Thank you for your work on this subject! I think this is a very
useful optimization)

While looking through your code, I noticed some points that I think
should be taken into account. Firstly, I noticed only two tests to
verify the functionality of this function and I think that this is not
enough.
Are you thinking about adding some tests with queries involving, for
example, join connections with different tables and unusual operators?

In addition, I have a question about testing your feature on a
benchmark. Are you going to do this?

On 17.07.2024 16:24, Alexander Pyhalov wrote:
> Hello.
>
> I'd like to make MergeAppend node Async-capable like Append node.
> Nowadays when planner chooses MergeAppend plan, asynchronous execution
> is not possible. With attached patches you can see plans like
>
> EXPLAIN (VERBOSE, COSTS OFF)
> SELECT * FROM async_pt WHERE b % 100 = 0 ORDER BY b, a;
>                                                           QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------
>
>  Merge Append
>    Sort Key: async_pt.b, async_pt.a
>    ->  Async Foreign Scan on public.async_p1 async_pt_1
>          Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
>          Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE (((b %
> 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
>    ->  Async Foreign Scan on public.async_p2 async_pt_2
>          Output: async_pt_2.a, async_pt_2.b, async_pt_2.c
>          Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE (((b %
> 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
>
> This can be quite profitable (in our test cases you can gain up to two
> times better speed with MergeAppend async execution on remote servers).
>
> Code for asynchronous execution in Merge Append was mostly borrowed
> from Append node.
>
> What significantly differs - in ExecMergeAppendAsyncGetNext() you must
> return tuple from the specified slot.
> Subplan number determines tuple slot where data should be retrieved
> to. When subplan is ready to provide some data,
> it's cached in ms_asyncresults. When we get tuple for subplan,
> specified in ExecMergeAppendAsyncGetNext(),
> ExecMergeAppendAsyncRequest() returns true and loop in
> ExecMergeAppendAsyncGetNext() ends. We can fetch data for
> subplans which either don't have cached result ready or have already
> returned them to the upper node. This
> flag is stored in ms_has_asyncresults. As we can get data for some
> subplan either earlier or after loop in ExecMergeAppendAsyncRequest(),
> we check this flag twice in this function.
> Unlike ExecAppendAsyncEventWait(), it seems
> ExecMergeAppendAsyncEventWait() doesn't need a timeout - as there's no
> need to get result
> from synchronous subplan if a tuple form async one was explicitly
> requested.
>
> Also we had to fix postgres_fdw to avoid directly looking at Append
> fields. Perhaps, accesors to Append fields look strange, but allows
> to avoid some code duplication. I suppose, duplication could be even
> less if we reworked async Append implementation, but so far I haven't
> tried to do this to avoid big diff from master.
>
> Also mark_async_capable() believes that path corresponds to plan. This
> can be not true when create_[merge_]append_plan() inserts sort node.
> In this case mark_async_capable() can treat Sort plan node as some
> other and crash, so there's a small fix for this.

I think you should add this explanation to the commit message because
without it it's hard to understand the full picture of how your code works.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2024-08-10 20:57:35 Re: Vacuum statistics
Previous Message Andrei Zubkov 2024-08-10 19:37:25 Re: Vacuum statistics