From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | pgsql-patches(at)postgresql(dot)org |
Subject: | [Fwd: Re: [HACKERS] pg_dump additional options for performance] |
Date: | 2008-05-07 14:37:09 |
Message-ID: | 1210171029.4268.115.camel@ebony.site |
Lists: | pgsql-patches |
Re-sending post as discussed with Bruce...
On Sun, 2008-03-23 at 12:45 -0300, Alvaro Herrera wrote:
> Bruce Momjian wrote:
> >
> > Added to TODO:
> >
> > o Allow pre/data/post files when dumping a single object, for
> > performance reasons
> >
> > http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
>
> "When dumping a single object"?? Do you mean database?
It would apply to whatever set of objects is selected through the
database and table include/exclude switches.
I've written a patch that implements these new switches on the commands
as shown:
pg_dump --schema-pre-load
pg_dump --schema-post-load
pg_restore --schema-pre-load
pg_restore --schema-post-load
I have not implemented the --schema-pre-file=xxx style of switch because
it makes no sense when using pg_restore in direct database connection mode.
On reflection I don't see any particular need to produce multiple files
as output, which just complicates an already horrendous user interface.
This is a minimal set of changes and includes nothing at all about
directories, parallelisation in the code, etc.
This has the following use cases amongst others...
* dump everything to a file, then use pg_restore first --schema-pre-load
and then --data-only directly into the database, then pg_restore
--schema-post-load to a file so we can edit that file into multiple
pieces to allow index creation in parallel
* dump of database into multiple files by manually specifying which
tables go where, then reload in parallel using multiple psql sessions
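The first use case above can be sketched as a sequence of pg_restore calls. This is only an illustration: --schema-pre-load and --schema-post-load are the switches proposed in this patch and do not exist in any released pg_restore, the file and database names are made up, and split_round_robin is a hypothetical helper that assumes the post-load file has already been separated into individual statements.

```python
# Sketch only: --schema-pre-load and --schema-post-load are the switches
# proposed in this patch, not options in any released pg_restore.
from typing import List


def restore_phases(archive: str, dbname: str, post_sql: str) -> List[List[str]]:
    """Return the three pg_restore invocations for the first use case, in order."""
    return [
        # 1. pre-load schema, restored directly into the database
        ["pg_restore", "--schema-pre-load", "-d", dbname, archive],
        # 2. table data, restored directly into the database
        ["pg_restore", "--data-only", "-d", dbname, archive],
        # 3. post-load schema, written to a file so it can be edited by hand
        ["pg_restore", "--schema-post-load", "-f", post_sql, archive],
    ]


def split_round_robin(statements: List[str], pieces: int) -> List[List[str]]:
    """Deal already-separated SQL statements into `pieces` lists."""
    return [statements[i::pieces] for i in range(pieces)]


if __name__ == "__main__":
    # Hypothetical archive, database, and output file names.
    for cmd in restore_phases("full.dump", "mydb", "post.sql"):
        print(" ".join(cmd))
```

Each list produced by split_round_robin could then be fed to its own psql session, which is the manual parallel index creation described above.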
The patch passes my own testing, though without a test suite that
probably covers no more than a few percent of all the possible
code paths. There are no docs for it as yet.
---
Further thinking on this....
Some further refinement might replace --data-only and --schema-only with
--want-schema-pre
--want-data
--want-schema-post
--want-schema (same as --want-schema-pre --want-schema-post)
These could be used together e.g. --want-schema-pre --want-data
whereas the existing --data-only type switches cannot.
That would be a straightforward and useful change to the enclosed
patch.
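The way the --want-* switches would combine can be modelled as a set of flags. This is a sketch of the proposed semantics only: the switches are not implemented in the enclosed patch, and the default used here (no switch means dump everything) is my assumption for illustration.

```python
# Sketch only: the --want-* switches are a proposal in this message,
# not part of the enclosed patch or of any released pg_dump.
from enum import Flag, auto


class Want(Flag):
    SCHEMA_PRE = auto()
    DATA = auto()
    SCHEMA_POST = auto()
    # --want-schema is shorthand for both schema sections
    SCHEMA = SCHEMA_PRE | SCHEMA_POST


SWITCHES = {
    "--want-schema-pre": Want.SCHEMA_PRE,
    "--want-data": Want.DATA,
    "--want-schema-post": Want.SCHEMA_POST,
    "--want-schema": Want.SCHEMA,
}


def sections(args):
    """OR together the requested sections.

    Assumption: giving no switch at all means everything is wanted.
    """
    wanted = Want(0)
    for arg in args:
        wanted |= SWITCHES[arg]
    return wanted or (Want.SCHEMA | Want.DATA)
```

For example, sections(["--want-schema-pre", "--want-data"]) selects the pre-load schema plus the data, a combination that --schema-only and --data-only cannot express.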
That way of doing things is hierarchically extensible to include further
subdivisions of the set of SQL commands produced, e.g. divide
--want-schema-post into objects required to support various inter-table
dependencies and those that aren't, such as additional indexes. I don't
personally think we need that though.
Comments?
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
Attachment | Content-Type | Size |
---|---|---|
pg_dump_prepost.v1.patch | text/x-patch | 29.9 KB |