Re: Strategies/Best Practises Handling Large Tables

From: Chitra Creta <chitracreta(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Strategies/Best Practises Handling Large Tables
Date: 2012-10-21 12:46:55
Message-ID: CABkVLeObb-7xvwsFqKTcX3S3gcFCEsaWfG=JhAK1K-Pwa3YqZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thank you all.

Ryan, would you mind sharing your one-time function to move it?

Merlin, what are your suggestions to improve query performance?

Shaun, thank you. I will look into facts and dimensions should all else
fail.

Chris, would you mind giving me an example of what you mean by your log,
aggregate and snapshot approach. Also, with indexing, I believe composite
and partial indexes are better than indexes, am I correct? Do you have any
recommendations as to which type (e.g btree, hash) is better for which
situations.

Thank you,
Chitra

On Thu, Oct 18, 2012 at 12:47 AM, Chris Travers <chris(dot)travers(at)gmail(dot)com>wrote:

>
>
> On Fri, Oct 12, 2012 at 7:44 AM, Chitra Creta <chitracreta(at)gmail(dot)com>wrote:
>
>> Hi,
>>
>> I currently have a table that is growing very quickly - i.e 7 million
>> records in 5 days. This table acts as a placeholder for statistics, and
>> hence the records are merely inserted and never updated or deleted.
>>
>> Many queries are run on this table to obtain trend analysis. However,
>> these queries are now starting to take a very long time (hours) to execute
>> due to the size of the table.
>>
>> I have put indexes on this table, to no significant benefit. Some of the
>> other strategies I have thought of:
>> 1. Purge old data
>> 2. Reindex
>> 3. Partition
>> 4. Creation of daily, monthly, yearly summary tables that contains
>> aggregated data specific to the statistics required
>>
>> Does anyone know what is the best practice to handle this situation?
>>
>
> The answer is well, it depends. Possibly some combination.
>
> One approach I like that may be included in #4 but not necessarily is the
> idea of summary tables which contain snapshots of the data, allowing you to
> roll forward or backward from defined points. This is what I call the log,
> aggregate, and snapshot approach. But it really depends on what you are
> doing and there is no one size fits all approach at this volume.
>
> Instead of reindexing, I would suggest also looking into partial indexes.
>
> Best Wishes,
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Chris Travers 2012-10-21 13:14:10 Re: Strategies/Best Practises Handling Large Tables
Previous Message John R Pierce 2012-10-21 05:00:12 Re: Create extension tablefunc fails with "wrong ELF class: ELFCLASS32"