From: | Greg Smith <greg(at)2ndquadrant(dot)com> |
---|---|
To: | Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Massive table (500M rows) update nightmare |
Date: | 2010-01-08 18:12:00 |
Message-ID: | 4B477570.4080804@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Carlo Stonebanks wrote:
> This is hosted on a new server the client set up so I am waiting for
> the exact OS and hardware config. PG Version is PostgreSQL 8.3.6,
> compiled by Visual C++ build 1400, OS appears to be Windows 2003 x64
> Server.
>
> More than anything, I am more concerned with the long-term use of the
> system. This particular challenge with the 500M row update is one
> thing, but I am concerned about the exceptional effort required to do
> this. Is it REALLY this exceptional to want to update 500M rows of
> data in this day and age? Or is the fact that we are considering
> dumping and restoring and dropping indexes, etc to do all an early
> warning that we don't have a solution that is scaled to the problem?
It's certainly not common or easy to handle. If someone told me I had
to make that task well perform well and the tools at hand were anything
other than a largish UNIX-ish server with a properly designed disk
subsystem, I'd tell them it's unlikely to work well. An UPDATE is the
most intensive single operation you can do in PostgreSQL; the full
lifecycle of executing it requires:
-Creating a whole new row
-Updating all the indexes to point to the new row
-Marking the original row dead
-Pruning the original row out of the database once it's no longer
visible anywhere (VACUUM)
As a rule, if you can't fit a large chunk of the indexes involved in
shared_buffers on your system, this is probably going to have terrible
performance. And you're on a platform where that's very hard to do,
even if there's a lot of RAM around. It sounds like your system spends
all its time swapping index blocks in and out of the database buffer
cache here. That suggests there's a design problem here either with
those indexes (they're too big) or with the provisioned
hardware/software combination (needs more RAM, faster disks, or a
platform where large amounts of RAM work better).
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2010-01-08 18:12:40 | Re: Massive table (500M rows) update nightmare |
Previous Message | Kenneth Marshall | 2010-01-08 18:08:36 | Re: Massive table (500M rows) update nightmare |