Re: Function scan FDW pushdown

From: g(dot)kashkin(at)postgrespro(dot)ru
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Function scan FDW pushdown
Date: 2024-11-05 16:11:00
Message-ID: a7f8c4dc2cede1054229c13769a1ba93@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alexander Pyhalov писал(а) 2021-10-04 10:42:
> Ashutosh Bapat писал 2021-06-15 16:15:
>> Hi Alexander,
>
> Hi.
>
> The current version of the patch is based on asymetric partition-wise
> join.
> Currently it is applied after
> v19-0001-Asymmetric-partitionwise-join.patch from
> on
> https://www.postgresql.org/message-id/792d60f4-37bc-e6ad-68ca-c2af5cbb2d9b@postgrespro.ru
> .
>
>>> So far I don't know how to visualize actual function expression used
>>> in
>>> function RTE, as in postgresExplainForeignScan() es->rtable comes
>>> from
>>> queryDesc->plannedstmt->rtable, and rte->functions is already 0.
>>
>> The actual function expression will be part of the Remote SQL of
>> ForeignScan node so no need to visualize it separately.
>
> We still need to create tuple description for functions in
> get_tupdesc_for_join_scan_tuples(),
> so I had to remove setting newrte->functions to NIL in
> add_rte_to_flat_rtable().
> With rte->functions in place, there's no issues for explain.
>
>>
>> The patch will have problems when there are multiple foreign tables
>> all on different servers or use different FDWs. In such a case the
>> function scan's RelOptInfo will get the fpinfo based on the first
>> foreign table the function scan is paired with during join planning.
>> But that may not be the best foreign table to join. We should be able
>> to plan all the possible joins. Current infra to add one fpinfo per
>> RelOptInfo won't help there. We need something better.
>
> I suppose attached version of the patch is more mature.
>
>>
>> The patch targets only postgres FDW, how do you see this working with
>> other FDWs?
>
> Not now. We introduce necessary APIs for other FDWs, but implementing
> TryShippableJoinPaths()
> doesn't seem straightforward.
>
>>
>> If we come up with the right approach we could use it for 1. pushing
>> down queries with IN () clause 2. joining a small local table with a
>> large foreign table by sending the local table rows down to the
>> foreign server.

Hi,
This is a long-overdue follow-up to the original patch.
Note that this patch contains only the changes required for
function scan pushdown, examples and code related to asymmetric
join are dropped.

In this version, we have fixed some of the issues:
1) Functions returning records are now deparsed correctly
2) I have benchmarked this version alongside the current master
using HammerDB and it shows a consistent ~6000 number of orders
per minute (NOPM) compared to ~100 NOPM on local setup with
1 virtual user and 10 warehouses

The benchmarking was concluded in the following way:
1) schema was created and filled up by buildschema on the first instance
2) foreign schema was imported on the second instance
3) functions created by buildschema (all in namespace 'public') were
exported from the first instance and created on the second one
4) HammerDB was connected to the second instance, and the benchmark
was run there

Note that the HammerDB service functions that were imported to the
second
instance are not the ones being pushed down, only PostgreSQL built-ins
are.

The issue with setting newrte->functions to NIL still persists.
I've attempted to work around it by adding the rte->functions
list to fdw_private field. I stored a copy of rte->functions for each
RTE in the order they are listed in simple_rte_array and accessed it
where the rte->functions were used in the original patch. While this
being pretty straightforward, I found out the hard way that the RTEs
are flattened and the original sequential order known at
postgresGetForeignPlan() and postgresPlanDirectModify() is altered
randomly. It prevents me from looking up the right functions list for
the RTE by its rti in postgresExplainForeignScan() and its var->varno in
get_tupdesc_for_join_scan_tuples(). I am aware that the rte->functions
will now be copied even on instances that don't utilize a FDW, but I
don't see a way to solve it. Any suggestions are welcome.

Best regards,
Gleb Kashkin,
Postgres Professional

Attachment Content-Type Size
0001-Push-join-with-function-scan-to-remote-server.patch text/x-diff 52.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-11-05 16:17:48 Re: Virtual generated columns
Previous Message Fujii Masao 2024-11-05 16:09:12 Re: Making error message more user-friendly with spaces in a URI