Re: Why is DEFAULT_FDW_TUPLE_COST so insanely low?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Why is DEFAULT_FDW_TUPLE_COST so insanely low?
Date: 2023-10-28 23:45:08
Message-ID: ZT2dBCFrx89lLLgM@momjian.us
Lists: pgsql-hackers


Has anything been done about this issue?

---------------------------------------------------------------------------

On Wed, Aug 3, 2022 at 02:56:12AM +1200, David Rowley wrote:
> Over on [1] I was complaining that I thought DEFAULT_FDW_TUPLE_COST,
> which is defined as 0.01, was unrealistically low.
>
> For comparison, cpu_tuple_cost, for a tuple we probably expect to be in
> a CPU cache, is also 0.01. We've defined DEFAULT_PARALLEL_TUPLE_COST
> to be 0.1, which is 10x cpu_tuple_cost, and that's for a tuple coming
> from a shared memory segment. So why do we think DEFAULT_FDW_TUPLE_COST
> should be the same as cpu_tuple_cost when that's probably pulling a
> tuple from some remote server over some (possibly slow) network?
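>
> (For concreteness: the first two are ordinary GUCs (DEFAULT_PARALLEL_TUPLE_COST
> is just parallel_tuple_cost's default), while DEFAULT_FDW_TUPLE_COST is a
> compile-time default that postgres_fdw falls back to when a server doesn't
> set the fdw_tuple_cost option:
>
> SHOW cpu_tuple_cost;       -- 0.01 by default
> SHOW parallel_tuple_cost;  -- 0.1 by default
> -- fdw_tuple_cost / fdw_startup_cost are per-server postgres_fdw options;
> -- when unset, DEFAULT_FDW_TUPLE_COST (0.01) is used.
> )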
>
> I did a little experiment in the attached .sql file and did some maths
> to try to figure out what it's really likely to be costing us. I tried
> this with and without the attached hack, which stops the planner from
> considering remote grouping, to see how much slower pulling a million
> tuples through the FDW would be.
>
> I set up a loopback server on localhost (which has about the lowest
> possible network latency) and found that the patched query against the
> foreign server took:
>
> Execution Time: 530.000 ms
>
> This is pulling all million tuples over and doing the aggregate locally.
>
> Unpatched, the query took:
>
> Execution Time: 35.334 ms
>
> so about 15x faster.
>
> If I take the seqscan cost for querying the local table, which is
> 14425.00, multiply that by 15 (the factor by which pulling the 1
> million tuples was slower), and then divide by 1 million to get the
> extra cost per tuple, that comes to about 0.216. So that says
> DEFAULT_FDW_TUPLE_COST is about 21x lower than it should be.
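>
> (As a quick sanity check of that arithmetic, using the numbers above:
>
> SELECT 530.000 / 35.334;         -- ~15x slower through the FDW
> SELECT 14425.00 * 15 / 1000000;  -- ~0.216 extra cost per foreign tuple
> )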
>
> I tried cranking DEFAULT_FDW_TUPLE_COST up to 0.5 to see what plans
> would change in the postgres_fdw regression tests, and quite a number
> changed. Many seem to be pushing the sorts down to the remote server
> where they were being done locally before. A few others just seem
> weird. For example, the first one seems to be blindly adding a remote
> sort when it does no good. I think it would take quite a bit of study
> with a debugger to figure out what's going on with many of these.
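>
> (If anyone wants to try similar experiments without recompiling, a
> rough equivalent is to raise postgres_fdw's per-server option instead;
> the server name below is just illustrative:
>
> ALTER SERVER loopback OPTIONS (ADD fdw_tuple_cost '0.5');
> -- and to go back to the built-in default:
> ALTER SERVER loopback OPTIONS (DROP fdw_tuple_cost);
>
> The regression diff attached here was produced by changing the
> compiled-in default itself, though.)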
>
> Does anyone have any ideas why DEFAULT_FDW_TUPLE_COST was set so low?
>
> Does anyone object to it being set to something more realistic?
>
> David
>
> [1] https://www.postgresql.org/message-id/CAApHDvpXiXLxg4TsA8P_4etnuGQqAAbHWEOM4hGe=DCaXmi_jA@mail.gmail.com

> diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
> index 64632db73c..b4e3b91d7f 100644
> --- a/src/backend/optimizer/plan/planner.c
> +++ b/src/backend/optimizer/plan/planner.c
> @@ -3921,7 +3921,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
>      * If there is an FDW that's responsible for all baserels of the query,
>      * let it consider adding ForeignPaths.
>      */
> -    if (grouped_rel->fdwroutine &&
> +    if (0 && grouped_rel->fdwroutine &&
>         grouped_rel->fdwroutine->GetForeignUpperPaths)
>         grouped_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_GROUP_AGG,
>                                                       input_rel, grouped_rel,

> ALTER SYSTEM SET max_parallel_workers_per_gather = 0;
> SELECT pg_reload_conf();
>
> CREATE EXTENSION postgres_fdw;
> CREATE EXTENSION pg_prewarm;
>
>
> DO $d$
>     BEGIN
>         EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
>             OPTIONS (dbname '$$||current_database()||$$',
>                      port '$$||current_setting('port')||$$'
>             )$$;
>     END;
> $d$;
>
> CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
>
> CREATE TABLE public.t (a INT);
> INSERT INTO t SELECT x FROM generate_series(1,1000000) x;
> VACUUM FREEZE ANALYZE t;
> SELECT pg_prewarm('t');
>
> CREATE FOREIGN TABLE ft (
> a INT
> ) SERVER loopback OPTIONS (schema_name 'public', table_name 't');
>
> EXPLAIN (ANALYZE) SELECT COUNT(*) FROM ft;
> EXPLAIN (ANALYZE) SELECT COUNT(*) FROM t;
>
> DROP FOREIGN TABLE ft;
> DROP TABLE t;
> DROP SERVER loopback CASCADE;
> ALTER SYSTEM RESET max_parallel_workers_per_gather;
> SELECT pg_reload_conf();

> --- "expected\\postgres_fdw.out" 2022-08-03 01:34:42.806967000 +1200
> +++ "results\\postgres_fdw.out" 2022-08-03 02:33:40.719712900 +1200
> @@ -2164,8 +2164,8 @@
> -- unsafe conditions on one side (c8 has a UDT), not pushed down.
> EXPLAIN (VERBOSE, COSTS OFF)
> SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
> - QUERY PLAN
> ------------------------------------------------------------------------------
> + QUERY PLAN
> +------------------------------------------------------------------------------------------------------------------------------
> Limit
> Output: t1.c1, t2.c1, t1.c3
> -> Sort
> @@ -2182,7 +2182,7 @@
> -> Foreign Scan on public.ft1 t1
> Output: t1.c1, t1.c3
> Filter: (t1.c8 = 'foo'::user_enum)
> - Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1"
> + Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1" ORDER BY c3 ASC NULLS LAST, "C 1" ASC NULLS LAST
> (17 rows)
>
> SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
> @@ -2873,13 +2873,13 @@
> Sort
> Output: (sum(c1)), c2
> Sort Key: (sum(ft1.c1))
> - -> HashAggregate
> + -> GroupAggregate
> Output: sum(c1), c2
> Group Key: ft1.c2
> Filter: (avg((ft1.c1 * ((random() <= '1'::double precision))::integer)) > '100'::numeric)
> -> Foreign Scan on public.ft1
> Output: c1, c2
> - Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1"
> + Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC NULLS LAST
> (10 rows)
>
> -- Remote aggregate in combination with a local Param (for the output
> @@ -3123,12 +3123,12 @@
> Sort
> Output: (sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * random()) <= '1'::double precision))), c2
> Sort Key: (sum(ft1.c1) FILTER (WHERE ((((ft1.c1 / ft1.c1))::double precision * random()) <= '1'::double precision)))
> - -> HashAggregate
> + -> GroupAggregate
> Output: sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * random()) <= '1'::double precision)), c2
> Group Key: ft1.c2
> -> Foreign Scan on public.ft1
> Output: c1, c2
> - Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1"
> + Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC NULLS LAST
> (9 rows)
>
> explain (verbose, costs off)
> @@ -3885,24 +3885,21 @@
> -- subquery using stable function (can't be sent to remote)
> PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
> EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st2(10, 20);
> - QUERY PLAN
> -----------------------------------------------------------------------------------------------------------
> - Sort
> + QUERY PLAN
> +----------------------------------------------------------------------------------------------------------------------------------
> + Nested Loop Semi Join
> Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> - Sort Key: t1.c1
> - -> Nested Loop Semi Join
> + Join Filter: (t1.c3 = t2.c3)
> + -> Foreign Scan on public.ft1 t1
> Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> - Join Filter: (t1.c3 = t2.c3)
> - -> Foreign Scan on public.ft1 t1
> - Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
> - -> Materialize
> + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20)) ORDER BY "C 1" ASC NULLS LAST
> + -> Materialize
> + Output: t2.c3
> + -> Foreign Scan on public.ft2 t2
> Output: t2.c3
> - -> Foreign Scan on public.ft2 t2
> - Output: t2.c3
> - Filter: (date(t2.c4) = '01-17-1970'::date)
> - Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
> -(15 rows)
> + Filter: (date(t2.c4) = '01-17-1970'::date)
> + Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
> +(12 rows)
>
> EXECUTE st2(10, 20);
> c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8
> @@ -9381,21 +9378,19 @@
> -- test FOR UPDATE; partitionwise join does not apply
> EXPLAIN (COSTS OFF)
> SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1;
> - QUERY PLAN
> ---------------------------------------------------------------
> + QUERY PLAN
> +--------------------------------------------------------
> LockRows
> - -> Sort
> - Sort Key: t1.a
> - -> Hash Join
> - Hash Cond: (t2.b = t1.a)
> + -> Nested Loop
> + Join Filter: (t1.a = t2.b)
> + -> Append
> + -> Foreign Scan on ftprt1_p1 t1_1
> + -> Foreign Scan on ftprt1_p2 t1_2
> + -> Materialize
> -> Append
> -> Foreign Scan on ftprt2_p1 t2_1
> -> Foreign Scan on ftprt2_p2 t2_2
> - -> Hash
> - -> Append
> - -> Foreign Scan on ftprt1_p1 t1_1
> - -> Foreign Scan on ftprt1_p2 t1_2
> -(12 rows)
> +(10 rows)
>
> SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1;
> a | b
> @@ -9430,18 +9425,16 @@
> SET enable_partitionwise_aggregate TO false;
> EXPLAIN (COSTS OFF)
> SELECT a, sum(b), min(b), count(*) FROM pagg_tab GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
> - QUERY PLAN
> ------------------------------------------------------------
> - Sort
> - Sort Key: pagg_tab.a
> - -> HashAggregate
> - Group Key: pagg_tab.a
> - Filter: (avg(pagg_tab.b) < '22'::numeric)
> - -> Append
> - -> Foreign Scan on fpagg_tab_p1 pagg_tab_1
> - -> Foreign Scan on fpagg_tab_p2 pagg_tab_2
> - -> Foreign Scan on fpagg_tab_p3 pagg_tab_3
> -(9 rows)
> + QUERY PLAN
> +-----------------------------------------------------
> + GroupAggregate
> + Group Key: pagg_tab.a
> + Filter: (avg(pagg_tab.b) < '22'::numeric)
> + -> Append
> + -> Foreign Scan on fpagg_tab_p1 pagg_tab_1
> + -> Foreign Scan on fpagg_tab_p2 pagg_tab_2
> + -> Foreign Scan on fpagg_tab_p3 pagg_tab_3
> +(7 rows)
>
> -- Plan with partitionwise aggregates is enabled
> SET enable_partitionwise_aggregate TO true;
> @@ -9475,34 +9468,32 @@
> -- Should have all the columns in the target list for the given relation
> EXPLAIN (VERBOSE, COSTS OFF)
> SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
> - QUERY PLAN
> -------------------------------------------------------------------------
> - Sort
> - Output: t1.a, (count(((t1.*)::pagg_tab)))
> + QUERY PLAN
> +--------------------------------------------------------------------------------------------
> + Merge Append
> Sort Key: t1.a
> - -> Append
> - -> HashAggregate
> - Output: t1.a, count(((t1.*)::pagg_tab))
> - Group Key: t1.a
> - Filter: (avg(t1.b) < '22'::numeric)
> - -> Foreign Scan on public.fpagg_tab_p1 t1
> - Output: t1.a, t1.*, t1.b
> - Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1
> - -> HashAggregate
> - Output: t1_1.a, count(((t1_1.*)::pagg_tab))
> - Group Key: t1_1.a
> - Filter: (avg(t1_1.b) < '22'::numeric)
> - -> Foreign Scan on public.fpagg_tab_p2 t1_1
> - Output: t1_1.a, t1_1.*, t1_1.b
> - Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2
> - -> HashAggregate
> - Output: t1_2.a, count(((t1_2.*)::pagg_tab))
> - Group Key: t1_2.a
> - Filter: (avg(t1_2.b) < '22'::numeric)
> - -> Foreign Scan on public.fpagg_tab_p3 t1_2
> - Output: t1_2.a, t1_2.*, t1_2.b
> - Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3
> -(25 rows)
> + -> GroupAggregate
> + Output: t1.a, count(((t1.*)::pagg_tab))
> + Group Key: t1.a
> + Filter: (avg(t1.b) < '22'::numeric)
> + -> Foreign Scan on public.fpagg_tab_p1 t1
> + Output: t1.a, t1.*, t1.b
> + Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1 ORDER BY a ASC NULLS LAST
> + -> GroupAggregate
> + Output: t1_1.a, count(((t1_1.*)::pagg_tab))
> + Group Key: t1_1.a
> + Filter: (avg(t1_1.b) < '22'::numeric)
> + -> Foreign Scan on public.fpagg_tab_p2 t1_1
> + Output: t1_1.a, t1_1.*, t1_1.b
> + Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2 ORDER BY a ASC NULLS LAST
> + -> GroupAggregate
> + Output: t1_2.a, count(((t1_2.*)::pagg_tab))
> + Group Key: t1_2.a
> + Filter: (avg(t1_2.b) < '22'::numeric)
> + -> Foreign Scan on public.fpagg_tab_p3 t1_2
> + Output: t1_2.a, t1_2.*, t1_2.b
> + Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3 ORDER BY a ASC NULLS LAST
> +(23 rows)
>
> SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
> a | count
> @@ -9518,24 +9509,23 @@
> -- When GROUP BY clause does not match with PARTITION KEY.
> EXPLAIN (COSTS OFF)
> SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING sum(a) < 700 ORDER BY 1;
> - QUERY PLAN
> ------------------------------------------------------------------
> - Sort
> - Sort Key: pagg_tab.b
> - -> Finalize HashAggregate
> - Group Key: pagg_tab.b
> - Filter: (sum(pagg_tab.a) < 700)
> - -> Append
> - -> Partial HashAggregate
> - Group Key: pagg_tab.b
> - -> Foreign Scan on fpagg_tab_p1 pagg_tab
> - -> Partial HashAggregate
> - Group Key: pagg_tab_1.b
> - -> Foreign Scan on fpagg_tab_p2 pagg_tab_1
> - -> Partial HashAggregate
> - Group Key: pagg_tab_2.b
> - -> Foreign Scan on fpagg_tab_p3 pagg_tab_2
> -(15 rows)
> + QUERY PLAN
> +-----------------------------------------------------------
> + Finalize GroupAggregate
> + Group Key: pagg_tab.b
> + Filter: (sum(pagg_tab.a) < 700)
> + -> Merge Append
> + Sort Key: pagg_tab.b
> + -> Partial GroupAggregate
> + Group Key: pagg_tab.b
> + -> Foreign Scan on fpagg_tab_p1 pagg_tab
> + -> Partial GroupAggregate
> + Group Key: pagg_tab_1.b
> + -> Foreign Scan on fpagg_tab_p2 pagg_tab_1
> + -> Partial GroupAggregate
> + Group Key: pagg_tab_2.b
> + -> Foreign Scan on fpagg_tab_p3 pagg_tab_2
> +(14 rows)
>
> -- ===================================================================
> -- access rights and superuser

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.
