Re: Config for fast huge cascaded updates

From: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Config for fast huge cascaded updates
Date: 2017-06-27 01:29:59
Message-ID: 20170627012959.a4b32xfanjgqq76y@crankycanuck.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Jun 27, 2017 at 10:17:49AM +1200, Craig de Stigter wrote:
> We're doing a large migration on our site which involves changing most of
> the primary key values. We've noticed this is a *very* slow process.

Indeed.

Does the database need to be online when this is happening?

If it were me, I'd try to find a way to dump it, modify the data in a
dump file, and then reload it. I think that'd be faster.

Another way you might try, if you need to be online while doing this,
is to write the whole thing into a new SQL schema. Make the mods you
need. When you think you're close to done, put a trigger in the "old
schema" to update data in the new schema, then do a last pass to catch
anything you missed in the interim, then cut your application over to
the new schema (update the search_path, force everything to disconnect
and reconnect, and when they reconnect they have the new data in
place). A variation on this technique is also useful for gradual
roll-out of new features -- you don't have to upgrade everything at
once and you have a natural rollback strategy (but you need a more
complicated set of triggers that keeps the two schemas in sync during
cutover period).

This second approach isn't faster, it's hard on I/O and disk space,
but it keeps you up and you can do the changes at a leisurely pace.
Just make sure you have the I/O and space before you do it :)

Hope that helps,

A

--
Andrew Sullivan
ajs(at)crankycanuck(dot)ca

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joshua D. Drake 2017-06-27 02:26:08 Re: Config for fast huge cascaded updates
Previous Message David G. Johnston 2017-06-27 01:11:14 Re: LEFT JOIN, entry can not be referenced