Re: At what point does a big table start becoming too big?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Nick <nboutelier(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: At what point does a big table start becoming too big?
Date: 2012-08-23 13:46:06
Message-ID: CAHyXU0yWrJEcd2rn_+9TEWHevQ=8njUcp_W0kRj0niKvJgSWUw@mail.gmail.com
Lists: pgsql-general

On Wed, Aug 22, 2012 at 6:06 PM, Nick <nboutelier(at)gmail(dot)com> wrote:
> I have a table with 40 million rows and haven't had any performance issues yet.
>
> Are there any rules of thumb as to when a table starts getting too big?
>
> For example, maybe if the index size is 6x the amount of ram, if the table is 10% of total disk space, etc?

Well, that raises the question: ...and do what? I guess you probably
mean partitioning.

Partitioning doesn't reduce index size -- it makes total index size
*bigger*, since the higher btree nodes get duplicated in every
partition's index -- unless you can exploit the table structure around
the partition key so that fewer fields have to be indexed.

Where partitioning helps is in speeding up certain classes of bulk
operations, like deleting a bunch of rows -- maybe you can set it up
so that a partition can be dropped instead, for a huge efficiency win.
Partitioning also helps by breaking up administrative operations such
as vacuum, analyze, cluster, create index, reindex, etc. So I'd argue
it's time to start thinking about plan 'b' when you find yourself
getting concerned about the performance of those operations.
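
For example, against the same hypothetical schema as above:

    -- deleting a month row by row: lots of WAL, and lots of dead
    -- tuples left behind for vacuum to clean up later
    DELETE FROM events
     WHERE created_at >= '2012-07-01' AND created_at < '2012-08-01';

    -- dropping the partition instead: near-instant, and the heap and
    -- its indexes are reclaimed immediately
    DROP TABLE events_2012_07;

    -- maintenance can likewise be aimed at one partition at a time
    VACUUM ANALYZE events_2012_08;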

Partitioning aside, the way to reduce the number of rows you're
dealing with is to reorganize your data: classic normalization or use
of arrays are a couple of things you can try.
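
A purely hypothetical sketch of the array approach:

    -- one row per sensor reading: tall and narrow, one index entry
    -- per row
    CREATE TABLE readings (
        device_id bigint NOT NULL,
        taken_at  timestamptz NOT NULL,
        value     float8 NOT NULL
    );

    -- one row per device per day, readings folded into an array: row
    -- count (and index entries) drop by orders of magnitude, at the
    -- cost of cheap access to individual readings
    CREATE TABLE readings_by_day AS
    SELECT device_id,
           date_trunc('day', taken_at) AS day,
           array_agg(value ORDER BY taken_at) AS readings
      FROM readings
     GROUP BY device_id, date_trunc('day', taken_at);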

merlin
