From: "Markhof, Ingolf" <ingolf(dot)markhof(at)de(dot)verizon(dot)com>
To: Sebastian Dressler <sebastian(at)swarm64(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: Foreign table performance issue / PostgreSQL vs. ORACLE
Date: 2021-02-02 12:05:53
Message-ID: 1e568ab5fb204e79b6f55620ccbd85c5@MS-FRA-E13EX02.intl.ad.vzwcorp.com
Lists: pgsql-general
Hi!
My PostgreSQL version is 11.8.
The query I am running refers to a number of foreign tables. The first one (table1) has two IDs, let's say ID_A and ID_B. While ID_A is unique, ID_B is not. In my case, I am pulling information for a value of ID_B for which about 800 rows (each with a unique ID_A) exist. I found:
While
select * from my_view where id_b='some value';
seemingly runs "forever" (I cancelled execution after a few hours), the following completes in about 1 hr:
select * from my_view where id_a in (
select id_a from table1 where id_b='some value'
);
So, I tried smaller chunks of ID_A and found the execution time is non-linear with respect to the number of IDs. For 50 ID_A's, for example, it completed in about 12 seconds.
Next, I split the ~800 ID_A's into chunks of 50 and submitted these 16 queries one after another. Each completed in about 12 seconds.
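The chunking described above can also be done on the server side. A hypothetical sketch, assuming the table/view names used in this thread (table1, my_view) and a batch size of 50:

```sql
-- Number the matching id_a values and assign each to a chunk of 50;
-- then query my_view one chunk at a time.
WITH numbered AS (
  SELECT id_a,
         (row_number() OVER (ORDER BY id_a) - 1) / 50 AS chunk
  FROM table1
  WHERE id_b = 'some value'
)
SELECT v.*
FROM my_view v
JOIN numbered n ON v.id_a = n.id_a
WHERE n.chunk = 0;  -- repeat with chunk = 1, 2, ... 15
```

Whether this helps depends on how much of the join the planner can push to the foreign side; the per-chunk pattern mirrors the 16 manual queries described above.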
I then found the option fetch_size, e.g. ALTER SERVER some_server OPTIONS (fetch_size '50000'). A chunk of 50 now executes in 2 seconds (instead of 12 before).
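For reference, fetch_size (which defaults to 100 rows per round trip in postgres_fdw) can be set at the server or the table level; the server and table names below are the ones from this thread:

```sql
-- Raise the batch size for all foreign tables on this server:
ALTER SERVER some_server OPTIONS (ADD fetch_size '50000');

-- Or override it for a single foreign table:
ALTER FOREIGN TABLE table1 OPTIONS (ADD fetch_size '50000');

-- Use SET instead of ADD if the option has been set before.
```

A larger fetch_size reduces the number of network round trips at the cost of higher memory use per batch.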
So, I found the "size" of the query has a serious impact on the execution time. I don't really understand why executing 16 queries of 50 IDs takes only 16 x 2 seconds, but executing one query of 800 IDs takes about 3000 seconds...
Regards,
Ingolf
-----Original Message-----
From: Sebastian Dressler [mailto:sebastian(at)swarm64(dot)com]
Sent: 30 January 2021 11:45
To: Markhof, Ingolf <ingolf(dot)markhof(at)de(dot)verizon(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: [E] Re: Foreign table performance issue / PostgreSQL vs. ORACLE
Hi Ingolf,
> On 29. Jan 2021, at 13:56, Markhof, Ingolf <ingolf(dot)markhof(at)de(dot)verizon(dot)com> wrote:
>
> Hi!
>
> I am struggling with the slow performance when running queries referring to foreign tables. – Yes, I know… - Please read the whole story!
Done and it rings a bell or two.
> The set-up basically is a production database and a reporting database. As names indicate, the production database is used for production, the reporting database is for analysis. On the reporting database, the only way to access product data is via foreign tables that link to the related production tables.
>
> Now, while some queries on the reporting service run fine, some don't even return any data after hours.
>
> However, the same set-up worked fine in Oracle before. Reporting wasn't always fast, but it delivered results in acceptable time. A query executed on the Oracle reporting server returns data in e.g. 30 seconds. But running the query translated to PostgreSQL on the PostgreSQL DB does not deliver a single row after hours (!) of run time.
>
> So, I wonder: Is there a fundamental difference between Oracle database links and foreign tables in PostgreSQL that could explain the different run times? Could there be some tuning option in PostgreSQL to make queries via foreign tables faster (e.g. I heard about option fetch_size)?
You did not explicitly mention it, but I assume you are using postgres_fdw to connect from reporting (R) to production (P). Thomas and Tom already mentioned incomplete/non-existing/non-applicable filter pushdowns. I want to add another probable root cause that may explain the behavior you are experiencing.
The postgres_fdw uses a CURSOR on P to execute the query. While this guarantees transaction safety, it also prohibits parallelism (PostgreSQL server-side cursors enforce a sequential plan).
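For illustration, the traffic postgres_fdw generates on the production side looks roughly like the following sketch (the cursor name and exact statements are internal to the FDW; the filter shown assumes it gets pushed down):

```sql
-- Roughly what postgres_fdw issues on the remote server P.
-- The cursor is what forces a serial (non-parallel) plan there.
START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
DECLARE c1 CURSOR FOR
  SELECT id_a, id_b FROM table1 WHERE id_b = 'some value';
FETCH 50000 FROM c1;  -- repeated; batch size = fetch_size option
CLOSE c1;
COMMIT;
```

Because PostgreSQL will not choose a parallel plan for a query run through a cursor, the remote scan is executed by a single worker regardless of the table size.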
As a result, depending on the size of tables, indexes, and filters pushed down (or not), this probably results in slow-running queries. IMO, the worst case is a sequential table scan without any filtering, executed by a single worker on the target.
Of course, you can try to optimize schemas on P and queries on R, enabling more filter pushdown and eventually a faster execution. However, I believe this does not work with your entire workload, i.e. there will always be performance gaps.
The parallelism issue is theoretically fixable by utilizing partitions on P. R then connects to P with multiple postgres_fdw-backed child tables. However, this will only work with a patch to postgres_fdw implementing "IsForeignScanParallelSafe" (see [1] for a possible implementation). Without this method, the partition scans still occur sequentially and show no performance gain.
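A hypothetical sketch of that layout on R, assuming the server and table names from this thread and four hash partitions (the matching partitioned tables would have to exist on P):

```sql
-- Partitioned parent on R; each partition is a foreign table
-- pointing at the corresponding partition on P.
CREATE TABLE table1_local (id_a bigint, id_b text)
  PARTITION BY HASH (id_a);

CREATE FOREIGN TABLE table1_p0
  PARTITION OF table1_local FOR VALUES WITH (MODULUS 4, REMAINDER 0)
  SERVER some_server OPTIONS (table_name 'table1_p0');

-- ...likewise for remainders 1 through 3.
```

As noted above, without a parallel-safe foreign scan these partition scans still run one after another, so this layout alone does not buy parallelism.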
I want to mention there are proprietary options available (re-)enabling PostgreSQL parallelism with cursors. Such an extension can potentially fix your performance issue. However, I have not tried it so far with a setup similar to yours.
Cheers,
Sebastian
--
Sebastian Dressler, Solution Architect, Swarm64 AS
+49 30 994 0496 72 | sebastian(at)swarm64(dot)com
Verizon Deutschland GmbH - Sebrathweg 20, 44149 Dortmund, Germany - Amtsgericht Dortmund, HRB 14952 - Geschäftsführer: Detlef Eppig - Vorsitzender des Aufsichtsrats: Francesco de Maio