Re: Better index stategy for many fields with few values

From: Oscar Picasso <oscgoogle(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Better index stategy for many fields with few values
Date: 2006-04-13 15:40:06
Message-ID: 20060413154006.46851.qmail@web54710.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Markus,

Markus Schaber <schabi(at)logix-tt(dot)com> wrote:

>Hi, Oscar,
>
>Please reply to the list and not privately, so others can learn from
>your replies, and possibly have better Ideas than me.

That was my intention. I made a mistake.

>Oscar Picasso wrote:
>
>> I cannot group the columns logically. Any column may or may not appear
>> in a query.
>
>That's suboptimal.
>
>> Summrarizing what I have learned:
>> - I cannot use multicolumn indexes because I cannot group the column
>> logically.
>> - I cannot use funtional indexes
>> - I cannot use clustering.
>
>You still can have a set of partitioned multi-column indices,
>overlapping enough that every combination of columns is covered (or risk
>a sequential sub scan for the last two or three columns, this should not
>hurt too much if the first 17 columns were selective enough).
>
>The main problem with indices is that they also decrease write performance.
>
>If disk costs are not limited, it will make sense to have WAL, table and
>indices on different disks / raid arrays, to parallelize writes.
>
>Btw, I guess you have multiple, concurrent users?

Yes I do.

I have just made other tests with only the individual indexes and performance is much better than previously. Obviously there was an I/O problem during my initial test.

Something interesting though. If I use few columns in the query the results come very quickly and pg does a sequential scan.

When it reachs some threshold (4 or 5 columns) pg switches to bitmap scans. It then takes an almost constant time (~ 2500 ms) not matter how many more columns I add to the where clause.

Interestingly enough, queries with many columns are less common. They also return less results and even many times no result at all.

From the user point of view it would be nice to have a waiting time lower than 2500ms for these queries. Maybe I could achieve that goal simply by tuning postgresql. In a such case where should I look first in order to increase bitmap scanning?

Maybe I could, that way, avoid the use of partitioned multi-column indexes.

Oscar


---------------------------------
Talk is cheap. Use Yahoo! Messenger to make PC-to-Phone calls. Great rates starting at 1&cent;/min.

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-04-13 15:50:08 Re: index is not used if I include a function that returns current time in my query
Previous Message Tom Lane 2006-04-13 15:26:50 Re: Slow query - possible bug?