Re: Index on multiple columns VS multiple index

From: "Jonathan S(dot) Katz" <jonathan(dot)katz(at)excoventures(dot)com>
To: Brice André <brice(at)famille-andre(dot)be>
Cc: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Index on multiple columns VS multiple index
Date: 2014-01-02 20:05:11
Message-ID: F430EDB0-191D-4CB9-9C2A-C967AFCF8772@excoventures.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi Brice,

On Jan 2, 2014, at 2:45 PM, Brice André wrote:

> Sorry, it's probably my bad english that is confusing. When I sayed inequality, I ment operators like <= or >=. In fact, 'a' is an integer' which is a foreign key on a primary key of another table and b is a timestamp.
>
> So, if I understand you, making an index on both ('a', 'b') will be faster than two separate indices ?
>
> But, if yes, does this index can be useful for a search on 'a' only ? Or do I need a separate index for this ?
>
> I generally also have a ORDER clause on 'b'. I suppose that the index will be a good point for it too ?

It should be more space-efficient to use a multi-column index, but you would have similar performance on SELECTs as having two indexes.

But now understanding your data a bit more, it probably would be better just to have an index on "a" - unless you will be searching only over "b" or even after filtering your data by "a" you will have a lot of rows that need to be filtered by "b" having a multi-column index or an index on "b" would be probably be overkill.

Of course, I use "a lot" because it really depends on what your actual data is - if you can you should probably run a few scenarios with your data set (if you can) and using EXPLAIN ANALYZE to see which index or indexes actually are used.

Keep in mind that you have to consider what happens when you have writes (INSERT/UPDATE/DELETE) on the table with your indexes, your write queries will have to wait for those indexes to be updated, thus putting more I/O load on the system.

> And, last question, I also have time-consuming queries that are of the form :
>
> SELECT .. FROM table WHERE 'a'=x AND 'c'=y AND 'b' >= z
>
> where 'c' is an integer, but that is not a foreign key. Does it makes sense to create an additional multi-column index on ('a', 'b', 'c') ? Does the order of declaration of columns in the index creation makes a difference ? (for example ('a', 'c', 'b')) ? And is this index useful for a search on 'a' and 'b' only ?

Probably not, unless you have a lot of rows to further filter from "c" - too much indexing could actually impede performance, which is why you need to experiment a bit :-)

Best,

Jonathan

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Jonathan S. Katz 2014-01-02 20:06:28 Re: Index on multiple columns VS multiple index
Previous Message Brice André 2014-01-02 20:00:21 Re: Index on multiple columns VS multiple index