From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Arne Roland <A(dot)Roland(at)index(dot)de> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path |
Date: | 2022-01-12 22:43:04 |
Message-ID: | ed041b8e-1440-a09a-3632-85b3985df598@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Pushed, after clarifying the comments a bit.
I also looked into what would it take to consider incremental paths, and
in principle it doesn't seem all that complicated. The attached PoC
patch extends get_cheapest_fractional_path_for_pathkeys() to optionally
build incremental sort on paths if needed. There are two GUCs that make
experimenting simpler:
* enable_fractional_paths - disables fractional paths entirely, i.e. we
get behavior without the part I already pushed
* enable_fractional_incremental_paths - disables the incremental sort
part added by the attached patch
With this, I get this plan (see the test in partitionwise_join.sql)
test=# EXPLAIN (COSTS OFF)
test-# SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id1, id2)
ORDER BY id1 ASC, id2 ASC LIMIT 10;
QUERY PLAN
------------------------------------------------------------------------------
Limit
-> Merge Left Join
Merge Cond: ((x.id1 = y.id1) AND (x.id2 = y.id2))
-> Append
-> Index Only Scan using fract_t0_id1_id2_idx on
fract_t0 x_1
-> Incremental Sort
Sort Key: x_2.id1, x_2.id2
Presorted Key: x_2.id1
-> Index Scan using fract_t1_pkey on fract_t1 x_2
-> Materialize
-> Append
-> Incremental Sort
Sort Key: y_1.id1, y_1.id2
Presorted Key: y_1.id1
-> Index Scan using fract_t0_pkey on
fract_t0 y_1
Index Cond: (id1 = id1)
Filter: (id2 = id2)
-> Incremental Sort
Sort Key: y_2.id1, y_2.id2
Presorted Key: y_2.id1
-> Index Scan using fract_t1_pkey on
fract_t1 y_2
Index Cond: (id1 = id1)
Filter: (id2 = id2)
(23 rows)
instead of
QUERY PLAN
------------------------------------------------------------------------------
Limit
-> Incremental Sort
Sort Key: x.id1, x.id2
Presorted Key: x.id1
-> Merge Left Join
Merge Cond: (x.id1 = y.id1)
Join Filter: (x.id2 = y.id2)
-> Append
-> Index Scan using fract_t0_pkey on fract_t0 x_1
-> Index Scan using fract_t1_pkey on fract_t1 x_2
-> Materialize
-> Append
-> Index Scan using fract_t0_pkey on
fract_t0 y_1
-> Index Scan using fract_t1_pkey on
fract_t1 y_2
(14 rows)
i.e. the incremental sorts moved below the merge join (and the cost is
lower, but that's not shown here).
So that seems reasonable, but there's a couple issues too:
1) Planning works (hence we can run explain), but execution fails
because of segfault in CheckVarSlotCompatibility - it gets NULL slot for
some reason. I haven't looked into the details, but I'd guess we need to
pass a different rel to create_incrementalsort_path, not childrel.
2) enable_partitionwisejoin=on seems to cause some confusion, because it
results in picking a different plan with higher cost. But that's clearly
not because of this patch.
3) I'm still a bit skeptical about the cost of this implementation - it
builds the incrementalsort path, just to throw most of the paths away.
It'd be much better to just calculate the cost, which should be much
cheaper, and only do the heavy work for the one "best" path.
4) One thing I did not realize before is what pathkeys we even consider
here. Imagine you have two tables:
CREATE TABLE t1 (a int, b int, primary key (a));
CREATE TABLE t2 (a int, b int, primary key (a));
and query
SELECT * FROM t1 JOIN t2 USING (a,b);
It seems reasonable to also consider pathkeys (a,b) because that's make
e.g. mergejoin much cheaper, right? But no, we'll not do that - we only
consider pathkeys that at least one child relation has, so in this case
it's just (a). Which means we'll never consider incremental sort for
this particular example.
It's a bit strange, because it's enough to create index on (a,b) for
just one of the relations, and it'll suddenly consider building
incremental sort on both sides.
I don't plan to pursue this further at this point, so I pushed the first
part because that's a fairly simple improvement over what we have now.
But it seems like a fairly promising area for improvements.
Also, the non-intuitive effects of enabling partition-wise joins (i.e.
picking plans with higher estimated costs) is something worth exploring,
I guess. But that's a separate issue.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
0001-Consider-incremental-sort-in-generate_orderedappend_.patch | text/x-patch | 9.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Sergey Shinderuk | 2022-01-12 23:01:24 | Re: Improve error handling of HMAC computations and SCRAM |
Previous Message | John Naylor | 2022-01-12 22:33:17 | Re: A qsort template |