Index over only uncommon values in table

From: Steven Schlansker <steven(at)likeness(dot)com>
To: "PostgreSQL General" <pgsql-general(at)postgresql(dot)org>
Subject: Index over only uncommon values in table
Date: 2013-06-18 19:17:27
Message-ID: A0FF27CE-2ABC-43C2-88BE-3C284CD8E802@likeness.com
Lists: pgsql-general

Hi everyone,

I assume this is not easy with standard PG but I wanted to double check.

I have a column with a very uneven distribution of values. About 95% of the rows share the same value, with a long tail of a few dozen other values.

I want to have an index over this column. Queries that select the most common value will not use the index, because that value makes up an overwhelming percentage of the table. This means that ~95% of the disk space and IOPS spent maintaining the index are "wasted".

I cannot use a hardcoded partial index because:
1) The common value is not known at schema definition time, and may change (very slowly) over time.
2) JDBC uses prepared statements for everything, and the value to be selected is not known at statement prepare time, so any partial indices are ignored. (This is a really, really obnoxious behavior and, sadly, makes partial indices almost useless in combination with prepared statements.)
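
For illustration, a hardcoded partial index of the kind ruled out above might look like the sketch below. The table name `items`, the column `category`, and the literals 'common_value' and 'rare_value' are hypothetical placeholders, not taken from the actual schema.

```sql
-- Hypothetical schema: "category" is the skewed column,
-- and 'common_value' stands in for the value held by ~95% of rows.
CREATE TABLE items (
    id       bigserial PRIMARY KEY,
    category text NOT NULL
);

-- Partial index that covers only the long tail of uncommon values.
-- The common value is baked into the index definition, which is
-- exactly what objection 1) is about.
CREATE INDEX items_category_uncommon_idx
    ON items (category)
    WHERE category <> 'common_value';

-- With a literal, the planner can check the query condition
-- against the index predicate at plan time.
SELECT id FROM items WHERE category = 'rare_value';

-- With a parameter placeholder, the value is unknown when the statement
-- is prepared, which is the situation objection 2) describes.
PREPARE find_items(text) AS
    SELECT id FROM items WHERE category = $1;
```

If the common value changed, an index like this would have to be dropped and recreated with a new predicate.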

The table size is expected to approach the 0.5 billion row mark within the next few months, hence my eagerness to save even seemingly small amounts of per-row costs.

Curious if anyone has a good way to approach this problem.
Thanks,
Steven
