From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Joachim Wieland <joe(at)mcknight(dot)de> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel pg_dump for 9.1 |
Date: | 2010-03-29 15:40:26 |
Message-ID: | 603c8f071003290840wa8b25dfr8eecfdd4e81fd16c@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Mar 29, 2010 at 10:46 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> - There are ideas on how to solve the issue with the consistent
> snapshot but in the end you can always solve it by stopping your
> application(s). I actually assume that whenever people are interested
> in a very fast dump, it is because they are doing some maintenance
> task (like migrating to a different server) that involves pg_dump. In
> these cases, they would stop their system anyway.
> Even if we had consistent snapshots in a future version, would we
> forbid people from running parallel dumps against old server versions? What
> I suggest is to just display a big warning if run against a server
> without consistent snapshot support (which currently is every
> version).
Seems reasonable.
> - Regarding the output of pg_dump I am proposing two solutions. The
> first one is to introduce a new archive type "directory" where each
> table and each blob is a file in a directory, similar to the
> experimental "files" archive type. Also the idea has come up that you
> should be able to specify multiple directories in order to make use of
> several physical disk drives. Thinking this further, in order to
> manage all the mess that you can create with this, every file of the
> same backup needs to have a unique identifier and pg_restore should
> have a check parameter that tells you if your backup directory is in a
> sane and complete state (think about moving a file from one backup
> directory to another one or trying to restore from two directories
> which are from different backup sets...).
I think that specifying several directories is a piece of complexity
that would be best left alone for a first version of this. But a
single directory with multiple files sounds pretty reasonable. Of
course we'll also need to support that format in non-parallel mode,
and in pg_restore.
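To make the "sane and complete state" check concrete, here is a minimal sketch in Python of what such a check could look like for a single directory. Everything in it is an assumption for illustration only: the manifest file name, the idea of stamping the backup set identifier on the first line of each file, and the function names are hypothetical and not part of any existing pg_dump or pg_restore format.

```python
import json
from pathlib import Path
from typing import List

# Hypothetical manifest name; not part of any real archive format.
MANIFEST_NAME = "manifest.json"


def write_member(backup_dir: Path, backup_id: str, name: str, payload: bytes) -> None:
    """Write one table/blob file, stamped with the backup set id on its first line."""
    (backup_dir / name).write_bytes(backup_id.encode() + b"\n" + payload)


def write_manifest(backup_dir: Path, backup_id: str, files: List[str]) -> None:
    """Record the backup set id and the files that belong to this backup."""
    manifest = {"backup_id": backup_id, "files": sorted(files)}
    (backup_dir / MANIFEST_NAME).write_text(json.dumps(manifest))


def read_member_id(path: Path) -> str:
    """Return the backup set id stored on the first line of a member file."""
    with path.open("rb") as fh:
        return fh.readline().rstrip(b"\n").decode(errors="replace")


def check_backup_dir(backup_dir: Path) -> List[str]:
    """Return a list of problems; an empty list means the directory looks sane."""
    manifest_path = backup_dir / MANIFEST_NAME
    if not manifest_path.is_file():
        return ["missing " + MANIFEST_NAME]
    manifest = json.loads(manifest_path.read_text())
    expected = set(manifest["files"])
    present = {
        p.name for p in backup_dir.iterdir()
        if p.is_file() and p.name != MANIFEST_NAME
    }
    problems = ["missing file: " + n for n in sorted(expected - present)]
    problems += ["unexpected file: " + n for n in sorted(present - expected)]
    # Files with the right name but from a different backup set.
    for name in sorted(expected & present):
        if read_member_id(backup_dir / name) != manifest["backup_id"]:
            problems.append("wrong backup set: " + name)
    return problems
```

A check like this would catch both cases mentioned above: a file that was moved away, and a file that was copied in from a different backup set.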
> The second solution to the single-file-problem is to generate no
> output at all, i.e. whatever you export from your source database you
> import directly into your target database, which in the end turns out
> to be a parallel form of "pg_dump | psql".
This is a very interesting idea but you might want to get the other
thing merged first, as it's going to present a different set of
issues.
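For reference, the effect of a parallel "pg_dump | psql" can be approximated today from outside pg_dump. The following is only a rough sketch under stated assumptions, not the proposed implementation: it starts one "pg_dump | psql" pipeline per table, the helper names and the jobs parameter are hypothetical, the target schema is assumed to exist already, and because each pg_dump takes its own snapshot the result is only consistent if the application is stopped, as discussed above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def build_commands(table: str, source_db: str, target_db: str) -> Tuple[List[str], List[str]]:
    """Build the two halves of one 'pg_dump | psql' pipeline for a single table."""
    dump_cmd = ["pg_dump", "--data-only", "--table", table, source_db]
    restore_cmd = ["psql", "--quiet", "--set", "ON_ERROR_STOP=1", target_db]
    return dump_cmd, restore_cmd


def copy_table(table: str, source_db: str, target_db: str) -> int:
    """Pipe one table from source to target; return a nonzero code on failure."""
    dump_cmd, restore_cmd = build_commands(table, source_db, target_db)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    restore = subprocess.Popen(restore_cmd, stdin=dump.stdout)
    # Close our copy of the pipe so pg_dump gets SIGPIPE if psql exits early.
    dump.stdout.close()
    restore_rc = restore.wait()
    dump_rc = dump.wait()
    return dump_rc or restore_rc


def parallel_copy(tables: List[str], source_db: str, target_db: str,
                  jobs: int = 4) -> Dict[str, int]:
    """Run up to 'jobs' pipelines at once; map each table to its exit code."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = pool.map(lambda t: copy_table(t, source_db, target_db), tables)
        return dict(zip(tables, codes))
```

Threads are enough here because the real work happens in the child processes; the Python side only waits on them.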
> I am currently not planning to make parallel dumps work with the
> custom format even though this would be possible if we changed the
> format to a certain degree.
I'm thinking we probably don't want to change the existing formats.
...Robert