Re: Suggested "easy" TODO: pg_dump --from-list

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Suggested "easy" TODO: pg_dump --from-list
Date: 2010-11-24 14:52:28
Message-ID: 17621.1290610348@sss.pgh.pa.us
Lists: pgsql-hackers

Joachim Wieland <joe(at)mcknight(dot)de> writes:
> On Wed, Nov 24, 2010 at 1:15 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Nope ... those strings are just helpful comments, they aren't really
>> guaranteed to be unique identifiers. In any case, it seems unlikely
>> that a user could expect to get the more complicated cases exactly right
>> other than by consulting "pg_dump | pg_restore -l" output. Which makes
>> the use-case kind of dubious to me.

> In which case would the catalogId, i.e. (tableoid, oid) not be unique?

Catalog OID + object OID would be unique, but surely we don't want to
make users specify the objects to be dumped that way.

Actually, what occurs to me to wonder is whether the facility has to
guarantee uniqueness at all. If for instance you have a group of overloaded
functions, is there really a big use-case for dumping just one and not
the whole group? Even if you think there's some use for it, is it big
enough to justify a quantum jump in the complexity of the feature?

Here's a radically simplified proposal: provide a switch
--object-name=pattern
where pattern follows the same rules as in psql \d commands (just
to use something users will already know). Dump every object,
of any type, whose qualified name matches the pattern, ie the same
objects that would be shown by \d (of the relevant type) using the
pattern. Accept multiple occurrences of the switch and dump the
union of the matched objects.

(Now that I think about it, this is the same as the existing --table
switch, just generalized to match any object type.)
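For illustration, the psql pattern rules the proposal would reuse are
documented as: '*' matches any sequence of characters, '?' matches any
single character, unquoted letters are folded to lower case, and the
result is anchored so it must match the whole name. A minimal sketch of
that translation (the function name is mine, and this ignores
double-quoting and schema-qualified patterns):

```python
import re

def psql_pattern_to_regex(pattern):
    # Minimal sketch of psql's \d pattern translation, ignoring
    # double-quoting and schema qualification: '*' becomes '.*',
    # '?' becomes '.', unquoted letters are folded to lower case,
    # and the whole pattern is anchored so it must match the
    # entire object name.
    parts = []
    for ch in pattern:
        if ch == '*':
            parts.append('.*')
        elif ch == '?':
            parts.append('.')
        else:
            parts.append(re.escape(ch.lower()))
    return '^(' + ''.join(parts) + ')$'

# e.g. the pattern 'ord*' matches relation names starting with "ord"
print(psql_pattern_to_regex('ord*'))
```

So a pattern like 'ord*' would select every object whose name starts
with "ord", which is exactly the \d matching behavior users already know.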

There would be some cases where this'd dump more than you really want,
but I think it'd make up for that in ease-of-use. It's not clear to me
that dumping a few extra objects is a big problem except for the case
where the objects are large tables, and in that case, if you aren't
specifying a sufficiently exact pattern, that's your own fault, not a
limitation of the feature.

BTW, what about dependencies? One of the main complaints we've heard
about pg_restore's filtering features is that they are not smart about
including things like the indexes of a selected table, or the objects it
depends on (eg, functions referred to in CHECK constraints). I'm not
sure that a pure name-based filter will be any more usable than
pg_restore's filter, if there is no accounting for dependencies.
The risk of not including dependencies at dump time is vastly higher
than in pg_restore, too, since by the time you realize you omitted
something critical it may be too late to go back and get another dump.
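The dependency information in question lives in the pg_depend system
catalog; as a sketch, a dependency-aware filter could discover what a
selected table needs, and what needs it, with queries along these lines
(the table name 'public.orders' is a placeholder):

```sql
-- Objects the selected table depends on
-- (e.g. functions referred to in CHECK constraints)
SELECT refclassid::regclass AS catalog, refobjid, deptype
FROM pg_depend
WHERE classid = 'pg_class'::regclass
  AND objid = 'public.orders'::regclass;

-- Objects that depend on the selected table (e.g. its indexes)
SELECT classid::regclass AS catalog, objid, deptype
FROM pg_depend
WHERE refclassid = 'pg_class'::regclass
  AND refobjid = 'public.orders'::regclass;
```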

regards, tom lane
