From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, Joachim Wieland <joe(at)mcknight(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Parallel pg_dump for 9.1 |
Date: | 2010-03-29 20:16:35 |
Message-ID: | 603c8f071003291316j63866a6j2298f4c03bbe6f12@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Mar 29, 2010 at 4:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> On 3/29/10 7:46 AM, Joachim Wieland wrote:
>>> I actually assume that whenever people are interested
>>> in a very fast dump, it is because they are doing some maintenance
>>> task (like migrating to a different server) that involves pg_dump. In
>>> these cases, they would stop their system anyway.
>
>> Actually, I'd say that there's a broad set of cases of people who want
>> to do a parallel pg_dump while their system is active. Parallel pg_dump
>> on a stopped system will help some people (for migration, particularly)
>> but parallel pg_dump with snapshot cloning will help a lot more people.
>
> I doubt that. My thought about it is that parallel dump will suck
> enough resources from the source server, both disk and CPU, that you
> would never want to use it on a live production machine. Not even at
> 2am. And your proposed use case is hardly a "broad set" in any case.
> Thus, Joachim's approach seems perfectly sane from here. I certainly
> don't see that there's an argument for spending 10x more development
> effort to pick up such use cases.
>
> Another question that's worth asking is exactly what the use case would
> be for parallel pg_dump against a live server, whether the snapshots are
> synchronized or not. You will not be able to use that dump as a basis
> for PITR, so there is no practical way of incorporating any changes that
> occur after the dump begins. So what are you making it for? If it's a
> routine backup for disaster recovery, fine, but it's not apparent why
> you want max speed and to heck with live performance for that purpose.
> I think migration to a new server version (that's too incompatible for
> PITR or pg_migrate migration) is really the only likely use case.

It's completely possible that you could want to clone a server for dev
and have more CPU and I/O bandwidth available than can be efficiently
used by a non-parallel pg_dump. But certainly what Joachim is talking
about will be a good start. I think there is merit to the
synchronized snapshot stuff for pg_dump and perhaps other applications
as well, but I think Joachim's (well-taken) point is that we don't
have to treat it as a hard prerequisite.
...Robert