From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Aram Fingal <fingal(at)multifactorial(dot)com>
Cc: Postgres-General General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Multiple indexes, huge table
Date: 2012-09-07 14:58:30
Message-ID: CAHyXU0wcrH3CCxAKA_hqQUfGcy-MZo0DUu_TdYX+6v3W1pwNTQ@mail.gmail.com
Lists: pgsql-general
On Thu, Sep 6, 2012 at 4:22 PM, Aram Fingal <fingal(at)multifactorial(dot)com> wrote:
> I have a table which currently has about 500 million rows. For the most part, the situation is going to be that I will import a few hundred million more rows from text files once every few months but otherwise there won't be any insert, update or delete queries. I have created five indexes, some of them multi-column, which make a tremendous difference in performance for the statistical queries which I need to run frequently (seconds versus hours.) When adding data to the table, however, I have found that it is much faster to drop all the indexes, copy the data to the table and then create the indexes again (hours versus days.) So, my question is whether this is really the best way. Should I write a script which drops all the indexes, copies the data and then recreates the indexes or is there a better way to do this?
>
> There are also rare cases where I might want to make a correction. For example, one of the columns is sample name which is a foreign key to a samples table defined with " ON UPDATE CASCADE." I decided to change a sample name in the samples table which should affect about 20 million rows out of the previously mentioned 500 million. That query has now been running for five days and isn't finished yet.
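The drop/load/recreate workflow described in the question can be sketched roughly as follows. This is a hypothetical illustration, not from the original message: the table name, index names, columns, and file path are all placeholders.

```sql
-- Sketch of the bulk-load pattern: drop indexes, COPY, rebuild.
-- All identifiers here are made up for illustration.
DROP INDEX IF EXISTS measurements_sample_idx;
DROP INDEX IF EXISTS measurements_sample_time_idx;

-- Bulk load is fastest against an index-free table.
COPY measurements FROM '/data/import/batch.txt';

-- Rebuild the indexes once, after the load.
CREATE INDEX measurements_sample_idx
    ON measurements (sample_name);
CREATE INDEX measurements_sample_time_idx
    ON measurements (sample_name, recorded_at);
```

Each index rebuild is a single sequential pass over the table, which is why one rebuild after the load beats maintaining five indexes row-by-row during it.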
Your case might do well with partitioning, particularly if you are
time-bottlenecked during the import. It will require some careful
thought before implementing, but the general scheme is to insert the
new data into a child table that gets its own indexes: this saves you
from having to reindex the whole table. Partitioning makes other
things more complicated, though (like referential integrity).
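A minimal sketch of that child-table approach, using the inheritance-based partitioning available in PostgreSQL at the time (all table, column, and file names below are hypothetical):

```sql
-- Child table inherits the parent's columns; a CHECK constraint lets
-- the planner skip this partition when constraint_exclusion is on.
CREATE TABLE measurements_2012q3 (
    CHECK (recorded_at >= DATE '2012-07-01'
       AND recorded_at <  DATE '2012-10-01')
) INHERITS (measurements);

-- Load the new batch into the empty child, then index only the child.
COPY measurements_2012q3 FROM '/data/import/batch.txt';
CREATE INDEX measurements_2012q3_sample_idx
    ON measurements_2012q3 (sample_name);

-- Queries against the parent transparently include all children.
SELECT count(*) FROM measurements WHERE sample_name = 'S-1234';
```

Only the new child is indexed per import, so index-build time scales with the batch size rather than with the full 500-million-row table.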
merlin