Re: How to optimize WHERE column_a IS NOT NULL OR column_b = 'value'

From: Igor Neyman <ineyman(at)perceptron(dot)com>
To: Andreas Joseph Krogh <andreas(at)visena(dot)com>, "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: How to optimize WHERE column_a IS NOT NULL OR column_b = 'value'
Date: 2014-08-11 18:38:14
Message-ID: A76B25F2823E954C9E45E32FA49D70EC919DE754@mail.corp.perceptron.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

From: pgsql-sql-owner(at)postgresql(dot)org [mailto:pgsql-sql-owner(at)postgresql(dot)org] On Behalf Of Andreas Joseph Krogh
Sent: Monday, August 11, 2014 12:01 PM
To: pgsql-sql(at)postgresql(dot)org
Subject: [SQL] How to optimize WHERE column_a IS NOT NULL OR column_b = 'value'

Hi folks,

I have the following schema (simplified for this example).

create table folder(id integer primary key, name varchar not null);

create table document(id serial primary key, name varchar not null, owner_id integer not null, folder_id integer references folder(id));
create index document_owner_idx ON document(owner_id);
create index document_folder_idx ON document(folder_id);

insert into folder(id, name) values(1, 'Folder A');
insert into folder(id, name) values(2, 'Folder B');
insert into document(name, owner_id, folder_id) values('Document A', 1, 1);
insert into document(name, owner_id, folder_id) values('Document B', 1, NULL);
insert into document(name, owner_id, folder_id) values('Document C', 2, 2);
insert into document(name, owner_id, folder_id) values('Document D', 2, NULL);

select f.id, f.name, doc.id, doc.owner_id, doc.name
FROM document doc left outer join folder f ON doc.folder_id = f.id
WHERE doc.folder_id is not null OR doc.owner_id = 1;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=0.15..13.77 rows=4 width=76) (actual time=0.031..0.045 rows=3 loops=1)
-> Seq Scan on document doc (cost=0.00..1.05 rows=4 width=44) (actual time=0.012..0.018 rows=3 loops=1)
Filter: ((folder_id IS NOT NULL) OR (owner_id = 1))
Rows Removed by Filter: 1
-> Index Scan using folder_pkey on folder f (cost=0.15..3.17 rows=1 width=36) (actual time=0.005..0.006 rows=1 loops=3)
Index Cond: (doc.folder_id = id)
Planning time: 0.267 ms
Execution time: 0.094 ms
(8 rows)

Is the a way to write a query which uses an index efficiently for such a schema?

I'd like to eliminate the Filter: ((folder_id IS NOT NULL) OR (owner_id = 1)) and rather have "index cond" insted, is that possible?

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas(at)visena(dot)com<mailto:andreas(at)visena(dot)com>
www.visena.com<https://www.visena.com>
[cid:image001(dot)png(at)01CFB571(dot)E24DB8B0]<https://www.visena.com>

You could try partial index on (column_b) where column_a IS NULL.

Regards,
Igor Neyman

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Adrian Klaver 2014-08-11 19:01:20 Re: function call
Previous Message David G Johnston 2014-08-11 16:13:27 Re: function call