From: | Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Improve OR conditions on joined columns. |
Date: | 2017-02-08 22:07:40 |
Message-ID: | 7f70bd5a-5d16-e05c-f0b4-2fdfc8873489@BlueTreble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I've a client interested enough in $SUBJECT that they're willing to
offer a bounty on it. An example of the pain is (working example attached):
create temp view denorm as
select f.*, d1.t t1, d2.t t2
from fact f
left join dim d1 on f1=d1.id
left join dim d2 on f2=d2.id
;
-- Fast
explain analyze select count(*) from denorm where 1 in (f1,f2);
explain analyze select count(*) from denorm where '1' in (t1);
-- Slow
explain analyze select count(*) from denorm where '1' in (t1,t2);
They currently work around this by doing a union:
select ... from denorm where t1 = '1'
union
select ... from denorm where t2 = '1'
... or depending on needs using IN instead of =.
AFAICT this can be transformed into a UNION (not all) if dim.id is
unique. Does the upper planner pathification make this any easier?
There's another transform using arrays that's possible as well (see
attached example); I believe that would work regardless of uniqueness.
Just to be clear; the OR by itself is not a problem (as shown by the
first fast query); it's the OR with the JOIN that's a problem.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Attachment | Content-Type | Size |
---|---|---|
test.sql | text/plain | 1.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jim Nasby | 2017-02-08 22:16:02 | Re: Press Release Draft - 2016-02-09 Cumulative Update |
Previous Message | Andres Freund | 2017-02-08 21:59:11 | Re: WIP: About CMake v2 |