Re: Alternatives to very large tables with many performance-killing indicies?

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Wells Oliver <wellsoliver(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Alternatives to very large tables with many performance-killing indicies?
Date: 2012-08-23 07:09:46
Message-ID: 20120823070946.GA18544@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Aug 22, 2012 at 05:56:27PM -0700, Jeff Janes wrote:
> > It's wide-ish, too, 98 columns.
>
> How many of the columns are NULL for any given row? Or perhaps
> better, what is the distribution of values for any given column? For
> a given column, is there some magic value (NULL, 0, 1, -1, 9999, '')
> which most of the rows have?

In particular, if the data is sparse, as in lots of NULLs, and you
don't need to search on those, you might consider partial indexes. If
you create partial indexes for only the non-NULL entries, postgres is
smart enough to use it when you query it for something not NULL.
Example:

db=# create temp table foo (a int4, b int4);
CREATE TABLE
db=# insert into foo (a) select generate_series(1,100000);
INSERT 0 100000
db=# update foo set b=1 where a=1;
UPDATE 1
db=# create index bar on foo(b) where b is not null;
CREATE INDEX
db=# explain select * from foo where b=1;
QUERY PLAN
--------------------------------------------------------------------
Bitmap Heap Scan on foo (cost=4.38..424.59 rows=500 width=8)
Recheck Cond: (b = 1)
-> Bitmap Index Scan on bar (cost=0.00..4.26 rows=500 width=0)
Index Cond: (b = 1)
(4 rows)

In this case a row update will only update indexes with non-NULL rows,
which may cut the overhead considerably.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Martin French 2012-08-23 07:22:08 Re: At what point does a big table start becoming too big?
Previous Message Gavin Flower 2012-08-23 07:04:02 Re: At what point does a big table start becoming too big?