From: | Etienne Champetier <champetier(dot)etienne(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | OR vs UNION vs UNION ALL |
Date: | 2015-09-23 13:38:45 |
Message-ID: | CAOdf3gqf9Hhxk3g6qqkV6SkQEpM2JiARY45W0ATQ=JdqvCntLA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi,
I've reworked a bit an application to make 1 query (with many OR) instead
of thousands (2900 in this bench),
but was a bit disappointed with the perf. Here follow a quick bench and 2
questions at the end:
Here is my table
CREATE TABLE properties
(
item_id text NOT NULL,
calendar_id uuid NOT NULL,
clef text,
valeur text,
recurrence_id bigint,
CONSTRAINT fk_props_id FOREIGN KEY (item_id, calendar_id, recurrence_id)
REFERENCES cal_item (itemid, calendar_id, recurrence_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
)
Here are 3 versions of the "same" query, there is ~2900 conditions (OR,
UNION, UNION ALL), returning ~9000 lines (same exact number each time)
(i've ran the tests multiple time, using postgresql 9.3.5 on rhel 6, on an
idle physical server)
1)
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE
(ITEM_ID=((E'1263d4ac-d8a3-44d7-b971-5eee947ff36a')::text) AND
CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND
RECURRENCE_ID=((0)::int8)) OR
(ITEM_ID=((E'9a1e40bb-0280-4e79-ae13-ee57d614fdef')::text) AND
CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND
RECURRENCE_ID=((0)::int8)) OR
.....
total time: 7.6s
plan:
Index Cond,
Bitmap index scan on fk_props_id,
BitmapOr,
Recheck Cond,
Bitmap Heap Scan on properties
2)
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE
(ITEM_ID=((E'1263d4ac-d8a3-44d7-b971-5eee947ff36a')::text) AND
CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND
RECURRENCE_ID=((0)::int8)) UNION
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE
(ITEM_ID=((E'9a1e40bb-0280-4e79-ae13-ee57d614fdef')::text) AND
CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND
RECURRENCE_ID=((0)::int8)) UNION
...
total time: 1s
plan:
Index Cond,
Index Scan using fki_props_id on properties properties_XXX,
Append,
HashAggregate
3)
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE
(ITEM_ID=((E'1263d4ac-d8a3-44d7-b971-5eee947ff36a')::text) AND
CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND
RECURRENCE_ID=((0)::int8)) UNION ALL
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE
(ITEM_ID=((E'9a1e40bb-0280-4e79-ae13-ee57d614fdef')::text) AND
CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND
RECURRENCE_ID=((0)::int8)) UNION ALL
...
total time: 1.2s
plan:
Index Cond,
Index Scan using fki_props_id on properties properties_XXX,
Append
Two questions:
1) Is it normal to have such a big difference between OR and UNION and
should i always prefer UNION when possible?
2) How can UNION ALL be slower than UNION, it's not doing the HashAggregate
but is 0.2s slower?
Thanks in advance
Etienne
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Dik | 2015-09-23 14:29:24 | Re: The curious case of two inserts, a shrinking xmax, and a ShareLock on transaction |
Previous Message | Daniel Begin | 2015-09-23 11:37:22 | Re: Advise on memory usage limitation by PostgreSQL on Windows |