From: | Alexey Nalbat <nalbat(at)price(dot)ru> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | unnesesary sorting after Merge Full Join |
Date: | 2008-02-21 10:08:00 |
Message-ID: | 200802211308.00798.nalbat@price.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello.
I'd like to use ORDER BY in any specified order and LIMIT, OFFSET for paging query results.
The query is FULL OUTER JOIN of two tables by field id. I think the results of Merge Full Join
to be ordered by some "combined id". And there is no need in extra Sort if I specify ORDER BY
that "combined id". But unfortunately it is not so.
Here is a simple example:
-- BEGIN
create table t1 as select generate_series(1,1000000,2) as id;
create table t2 as select generate_series(1,1000000,3) as id;
create index i1 on t1 ( id );
create index i2 on t2 ( id );
analyze t1;
analyze t2;
explain analyze
select id, t1.*, t2.*
from t1 natural full join t2
order by 1 limit 10 offset 10;
drop table t1;
drop table t2;
-- END
Postgresql chooses such plan:
Limit (cost=44080.12..44080.15 rows=10 width=8) (actual time=6724.850..6724.906 rows=10 loops=1)
-> Sort (cost=44080.10..45330.10 rows=500000 width=8) (actual time=6724.806..6724.845 rows=20 loops=1)
Sort Key: (COALESCE(t1.id, t2.id))
Sort Method: top-N heapsort Memory: 25kB
-> Merge Full Join (cost=0.00..30775.28 rows=500000 width=8) (actual time=0.142..5237.289 rows=666667 loops=1)
Merge Cond: (t1.id = t2.id)
-> Index Scan using i1 on t1 (cost=0.00..15212.30 rows=500000 width=4) (actual time=0.079..1188.601 rows=500000 loops=1)
-> Index Scan using i2 on t2 (cost=0.00..10146.30 rows=333334 width=4) (actual time=0.051..793.635 rows=333334 loops=1)
The desired plan is much faster:
Limit (cost=0.62..1.23 rows=10 width=8) (actual time=0.262..0.366 rows=10 loops=1)
-> Merge Full Join (cost=0.00..30775.28 rows=500000 width=8) (actual time=0.156..0.303 rows=20 loops=1)
Merge Cond: (t1.id = t2.id)
-> Index Scan using i1 on t1 (cost=0.00..15212.30 rows=500000 width=4) (actual time=0.088..0.120 rows=15 loops=1)
-> Index Scan using i2 on t2 (cost=0.00..10146.30 rows=333334 width=4) (actual time=0.056..0.078 rows=11 loops=1)
I found comment in src/backend/optimizer/path/pathkeys.c:
* EXCEPTION: in a FULL or RIGHT join, we cannot treat the result as
* having the outer path's path keys, because null lefthand rows may be
* inserted at random points. It must be treated as unsorted.
How can I get rid of this sorting? Or could this behavior of Merge Full Join be improved?
--
Alexey A. Nalbat
Price Express
http://www.price.ru/
http://www.tyndex.ru/
From | Date | Subject | |
---|---|---|---|
Next Message | Richard Huxton | 2008-02-21 10:20:02 | Re: ts_headline |
Previous Message | Mike Leahy | 2008-02-21 10:02:29 | No pgxs.mk with win32 binaries |