From: | Martin Pitt <mpitt(at)debian(dot)org> |
---|---|
To: | rtvd(at)me(dot)com |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #7884: pg_upgradecluster is terribly slow |
Date: | 2013-02-15 18:29:59 |
Message-ID: | 20130215182959.GA2531@piware.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hello,
rtvd(at)me(dot)com [2013-02-15 17:41 +0000]:
> I have tried upgrading a database from 8.4 to 9.1 and pg_upgradecluster is
> terribly slow.
Please note that this is a Debian/Ubuntu specific bug,
pg_upgradecluster is not shipped by upstream PostgreSQL.
I admittedly don't have any first-hand experience with migrating large
databases, and get very little feedback about it, so thanks for your
observations!
First of all, Peter has worked on a branch that makes
pg_upgradecluster use pg_upgrade, which hopefully will be a lot more
performant.
> 1. It uses "-Fc". This is bad as it enables compression by default. Which is
> pointless and wastes CPU power.
The tar format (-Ft) has also been supported since at least 8.4
(that's the earliest version which is still supported in
Debian/Ubuntu), and documentation says it's not compressed. If you
replace -Fc with -Ft, do you see a significant performance increase?
> 2. It seems to migrate the schema first and only then it moves data. This is
> bad as indices are being re-built when the data is finally poured in.
At least in earlier versions pg_dumpall wasn't able to dump BLOBs, and
the manpage doesn't indicate otherwise for 9.2 either. AFAIK this just
supports the plain SQL text format. I don't believe pg_dumpall would
avoid the rebuilding of incides?
> 3. The migration does not happen within a transaction.
> 1. migrate the schema
pg_upgradecluster does that, too.
> 2. drop all databases (leaving roles in)
> 3. re-creating the databases (without any content)
Out of interest, why does that help?
> 4. for each database run pg_dump <database> | psql --single-transaction
> <database>
As written above, I don't want to use the text format and psql, but
pg_restore also supports --single-transaction since at least 8.4.
The question which I'm not sure about is whether it's ok to use
--single-transaction even for very large databases. I. e. is piling up
gigabytes of data in a transaction and committing it all in the end
always more efficient than the default mode (which I assume will use
one transaction by row)? Is that a safe thing to do, or could one run
into out-of-memory conditions?
http://people.canonical.com/~pitti/tmp/pg_upgradecluster is a version
with these two changes: pg_restore --single-transaction and using the
tar format. Perhaps you can try this, and compare performance?
Thanks,
Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
From | Date | Subject | |
---|---|---|---|
Next Message | david.thomas | 2013-02-15 18:33:21 | BUG #7885: postmaster panic on startup does not release shared memory |
Previous Message | Kevin Grittner | 2013-02-15 18:27:47 | Re: BUG #7884: pg_upgradecluster is terribly slow |