Re: UNION versus SUB SELECT

From: Robert DiFalco <robert(dot)difalco(at)gmail(dot)com>
To: desmodemone <desmodemone(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: UNION versus SUB SELECT
Date: 2013-11-21 21:12:36
Message-ID: CAAXGW-wEuC3EopsNgtGoA256sBGtHQ1mdcF0_zMEXevXaaZx6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sorry I couldn't get buffers to work but here is the explain analyze
verbose:

dft1fjfv106r48=> explain analyze verbose select c.*

from contacts c

where c.id IN (

select p.contact_id

from
phone_numbers p

where (p.national = 5038904993 and p.e164 = '+15038904993'))

or c.id IN (

select e.contact_id

from
email_addresses e

where e.email = 'robert(dot)xxxxx(at)gmail(dot)com')

;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.contacts c (cost=8.12..75.73 rows=1988 width=95)
(actual time=0.410..0.410 rows=0 loops=1)
Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call,
c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id
Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
Rows Removed by Filter: 2849
SubPlan 1
-> Index Scan using idx_phone_address on public.phone_numbers p
(cost=0.06..4.06 rows=1 width=8) (actual time=0.015..0.015 rows=0 loops=1)
Output: p.contact_id
Index Cond: ((p."national" = 5038904993::bigint) AND
((p.e164)::text = '+15038904993'::text))
SubPlan 2
-> Index Scan using idx_email_address on public.email_addresses e
(cost=0.06..4.06 rows=1 width=8) (actual time=0.018..0.018 rows=0 loops=1)
Output: e.contact_id
Index Cond: ((e.email)::text = 'robert(dot)xxxxx(at)gmail(dot)com'::text)
Total runtime: 0.489 ms
(13 rows)

dft1fjfv106r48=> explain analyze verbose select c.*

from contacts c

where exists(

select 1

from
phone_numbers p

where (p.national = 5038904993 and p.e164 = '+15038904993') and
p.contact_id = c.id)
or EXISTS(

select 1

from
email_addresses e

where e.email = 'robert(dot)xxxxx(at)gmail(dot)com' and e.contact_id = c.id)

;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.contacts c (cost=0.00..21596.38 rows=1988 width=95)
(actual time=0.479..0.479 rows=0 loops=1)
Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call,
c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id
Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives:
SubPlan 3 or hashed SubPlan 4))
Rows Removed by Filter: 2849
SubPlan 1
-> Index Scan using idx_phone_address on public.phone_numbers p
(cost=0.06..4.06 rows=1 width=0) (never executed)
Index Cond: ((p."national" = 5038904993::bigint) AND
((p.e164)::text = '+15038904993'::text))
Filter: (p.contact_id = c.id)
SubPlan 2
-> Index Scan using idx_phone_address on public.phone_numbers p_1
(cost=0.06..4.06 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=1)
Output: p_1.contact_id
Index Cond: ((p_1."national" = 5038904993::bigint) AND
((p_1.e164)::text = '+15038904993'::text))
SubPlan 3
-> Index Scan using idx_email_address on public.email_addresses e
(cost=0.06..4.06 rows=1 width=0) (never executed)
Index Cond: ((e.email)::text = 'robert(dot)xxxxx(at)gmail(dot)com'::text)
Filter: (e.contact_id = c.id)
SubPlan 4
-> Index Scan using idx_email_address on public.email_addresses e_1
(cost=0.06..4.06 rows=1 width=8) (actual time=0.016..0.016 rows=0 loops=1)
Output: e_1.contact_id
Index Cond: ((e_1.email)::text = 'robert(dot)xxxxx(at)gmail(dot)com'::text)
Total runtime: 0.559 ms
(21 rows)

dft1fjfv106r48=> explain analyze verbose select * from contacts where id IN
(
(select c.id

from contacts c

join phone_numbers p on c.id =
p.contact_id and p.national = 5038904993 and p.e164 = '+15038904993')
union (select
c.id

from contacts c

join email_addresses e on c.id = e.contact_id and e.email
= 'robert(dot)xxxxx(at)gmail(dot)com'));

QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=16.31..24.39 rows=2 width=95) (actual time=0.060..0.060
rows=0 loops=1)
Output: contacts.id, contacts.owner_id, contacts.user_id,
contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype,
contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname,
contacts.fb_id
-> Unique (cost=16.26..16.26 rows=2 width=8) (actual time=0.057..0.057
rows=0 loops=1)
Output: c.id
-> Sort (cost=16.26..16.26 rows=2 width=8) (actual
time=0.055..0.055 rows=0 loops=1)
Output: c.id
Sort Key: c.id
Sort Method: quicksort Memory: 25kB
-> Append (cost=0.11..16.25 rows=2 width=8) (actual
time=0.034..0.034 rows=0 loops=1)
-> Nested Loop (cost=0.11..8.12 rows=1 width=8)
(actual time=0.013..0.013 rows=0 loops=1)
Output: c.id
-> Index Scan using idx_phone_address on
public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual
time=0.011..0.011 rows=0 loops=1)
Output: p.id, p.contact_id, p."national",
p.e164, p.raw_number
Index Cond: ((p."national" =
5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text))
-> Index Only Scan using
idx_contacts_pkey_owner on public.contacts c (cost=0.06..4.06 rows=1
width=8) (never executed)
Output: c.id, c.owner_id, c.user_id
Index Cond: (c.id = p.contact_id)
Heap Fetches: 0
-> Nested Loop (cost=0.11..8.12 rows=1 width=8)
(actual time=0.018..0.018 rows=0 loops=1)
Output: c_1.id
-> Index Scan using idx_email_address on
public.email_addresses e (cost=0.06..4.06 rows=1 width=8) (actual
time=0.016..0.016 rows=0 loops=1)
Output: e.id, e.contact_id, e.email
Index Cond: ((e.email)::text = '
robert(dot)xxxxx(at)gmail(dot)com'::text)
-> Index Only Scan using
idx_contacts_pkey_owner on public.contacts c_1 (cost=0.06..4.06 rows=1
width=8) (never executed)
Output: c_1.id, c_1.owner_id, c_1.user_id
Index Cond: (c_1.id = e.contact_id)
Heap Fetches: 0
-> Index Scan using idx_contacts_pkey_owner on public.contacts
(cost=0.06..4.06 rows=1 width=95) (never executed)
Output: contacts.id, contacts.owner_id, contacts.user_id,
contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype,
contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname,
contacts.fb_id
Index Cond: (contacts.id = c.id)
Total runtime: 0.332 ms
(31 rows)

On Thu, Nov 21, 2013 at 12:38 PM, desmodemone <desmodemone(at)gmail(dot)com> wrote:

> Could you please attache the plan with explain buffers verbose?
>
> thank you
>
>
> 2013/11/21 Robert DiFalco <robert(dot)difalco(at)gmail(dot)com>
>
>> UNION and subselect both performed better than EXISTS for this particular
>> case.
>>
>>
>> On Thu, Nov 21, 2013 at 12:31 PM, desmodemone <desmodemone(at)gmail(dot)com>wrote:
>>
>>> Hi Robert, could you try with "exists" ?
>>>
>>> SELECT c.*
>>> FROM contacts c
>>> WHERE exists ( SELECT 1 FROM phone p WHERE p.addr =? and
>>> p.contact_id=c.id )
>>> OR exists (SELECT 1 FROM email e WHERE e.addr = ? and e.contact_id=
>>> c.id );
>>>
>>>
>>>
>>>
>>>
>>> 2013/11/21 Robert DiFalco <robert(dot)difalco(at)gmail(dot)com>
>>>
>>>> I have found this:
>>>>
>>>> SELECT c.*
>>>> FROM contacts c
>>>> WHERE c.id IN ( SELECT p.contact_id FROM phone p WHERE p.addr = ? )
>>>> OR c.id IN (SELECT e.contact_id FROM email e WHERE e.addr = ? );
>>>>
>>>> To have a worse plan than:
>>>>
>>>> SELECT * FROM contacts where id IN (
>>>> ( SELECT c.id FROM contacts c
>>>> JOIN phone p ON c.id = p.contact_id AND p.addr = ?
>>>> UNION
>>>> SELECT c.id FROM contacts c
>>>> JOIN email e ON c.id = e.contact_id AND e.addr = ? );
>>>>
>>>> Maybe this is no surprise. But after discovering this my question is
>>>> this, is there another option I dont' know about that is logically the same
>>>> that can perform even better than the UNION?
>>>>
>>>
>>>
>>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert DiFalco 2013-11-21 22:04:02 Re: UNION versus SUB SELECT
Previous Message desmodemone 2013-11-21 20:38:06 Re: UNION versus SUB SELECT