From: | Matthew Wakeling <matthew(at)flymine(dot)org> |
---|---|
To: | Віталій Тимчишин <tivv00(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Very specialised query |
Date: | 2009-03-27 17:34:26 |
Message-ID: | alpine.DEB.2.00.0903271708080.21772@aragorn.flymine.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Fri, 27 Mar 2009, Віталій Тимчишин wrote:
> ...an index on (objectid, start) would help...
Definitely.
> You could try adding "AND l2.start > l1.start" to the first query.
> This will drop symmetric half of intersections (the ones that will
> remain are l2 inside or to the left of l1), but you can redo results by
> id1,id2 union all id2, id1 and may allow to use start index for
> "between"
That's remarkably clever, and should have been obvious. Thanks.
Adding the constraint as you say does indeed make the query fast. However,
there seems to be a problem with the planner, in that it does not
distinguish between the costs of two alternative plans, which have vastly
different real costs. Consider the following:
SELECT
l1.id AS id1,
l2.id AS id2
FROM
location l1,
location l2
WHERE
l1.objectid = 228000093
AND l2.objectid = 228000093
AND l1.id <> l2.id
AND l1.start < l2.end
AND l1.end > l2.start
AND l1.start < l2.start
UNION ALL
SELECT
l1.id AS id1, l2.id AS id2
FROM
location l1,
location l2
WHERE
l1.objectid = 228000093
AND l2.objectid = 228000093
AND l1.id <> l2.id
AND l1.start < l2.end
AND l1.end > l2.start
AND l1.start >= l2.start;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..20479179.74 rows=138732882 width=8)
-> Nested Loop (cost=0.00..9545925.46 rows=69366441 width=8)
Join Filter: ((l1.id <> l2.id) AND (l1.start < l2.end))
-> Index Scan using location_test_obj_end on location l1 (cost=0.00..55966.07 rows=43277 width=12)
Index Cond: (objectid = 228000093)
-> Index Scan using location_test_obj_start on location l2 (cost=0.00..123.10 rows=4809 width=12)
Index Cond: ((l2.objectid = 228000093) AND (l1.end > l2.start) AND (l1.start < l2.start))
-> Nested Loop (cost=0.00..9545925.46 rows=69366441 width=8)
Join Filter: ((l1.id <> l2.id) AND (l1.start < l2.end))
-> Index Scan using location_test_obj_end on location l1 (cost=0.00..55966.07 rows=43277 width=12)
Index Cond: (objectid = 228000093)
-> Index Scan using location_test_obj_start on location l2 (cost=0.00..123.10 rows=4809 width=12)
Index Cond: ((l2.objectid = 228000093) AND (l1.end > l2.start) AND (l1.start >= l2.start))
(13 rows)
Notice the two different index conditions:
(l1.end > l2.start) AND (l1.start < l2.start) - "between"
(l1.end > l2.start) AND (l1.start >= l2.start) - open-ended
Both have a cost of (cost=0.00..123.10 rows=4809 width=12)
Postgres estimates these two index scans to be equivalent in cost, where
they are actually vastly different in real cost. Shouldn't Postgres favour
a "between" index scan over an open-ended one?
Matthew
--
[About NP-completeness] These are the problems that make efficient use of
the Fairy Godmother. -- Computer Science Lecturer
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff | 2009-03-27 17:54:37 | Re: I have a fusion IO drive available for testing |
Previous Message | david | 2009-03-27 17:30:25 | Re: I have a fusion IO drive available for testing |