Re: deparsing utility commands

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: deparsing utility commands
Date: 2015-02-24 12:26:35
Message-ID: 20150224122635.GL29780@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> Hi,
>
> On 2015-02-23 19:48:43 -0500, Stephen Frost wrote:
> > > Yes, it might be possible to use the same code for a bunch of minor
> > > commands, but not for the interesting/complex stuff.
> >
> > We can clearly rebuild at least CREATE commands for all objects without
> > access to the parse tree, obviously pg_dump manages somehow.
>
> That's not the same. pg_dump recreates a static state, not running log
> of changes that can be replayed.

There isn't anything particularly special about these function, that I
can see at least, when it comes to a 'running log' vs. 'static state'.

> > I didn't specify that a single command had to be used.
>
> Well, if you want to get 'the create table statement' you'll need to
> decide what you want. With ddl event triggers it's clear - something
> that, when replayed, results in the same catalog contents. Such an easy
> definition doesn't, as far as I know, exist for a set of functions you
> seem to imagine.

This is really an orthogonal consideration. I'd draw a parallel to
CREATE TABLE .. LIKE, where you have to specify what you want. Perhaps
that's reason enough for this initial version to not be exposed out to
psql, but it doesn't change the point that going based off of the
catalog and not the parsetree would facilitate this capability.
Requiring the parsetree precludes any progress in this direction.

> > Further, the actual deparse_CreateStmt doesn't look like it'd have a
> > terribly hard time producing something without access to the
> > parsetree.
>
> The interesting bit isn't the deparse_CreateStmt itself, but all the
> stuff that's also triggered when you do a CREATE TABLE
> nontrival-stuff;. utility.c will first transform the statement and then
> run run and stash every created subcommand. And the subcommands will
> *also* get deparsed.

Right, all of that is completely independent of deparse_CreateStmt and
there's nothing in your argument for why deparse_CreateStmt should get
access to or need the parsetree. If I'm following correctly, your
argument is that we shouldn't care that deparse_CreateStmt gets the
parsetree because of how it's getting called and the argument I'm making
is that deparse_CreateStmt really doesn't need the parsetree and if it
wasn't required then deparse_CreateStmt could serve additional
use-cases.

> Just think about what to do about CREATE TABLE foo(id serial primary
> key, data text, bar_id REFERENCES foo.bar); - there's no way you can get
> which other objects to dump from the catalog alone. What for a schema,
> considering CREATE SCHEMA ... (schema_elements)+;?
>
> Sure, individual subcommands could refer to the catalog instead of the
> parse tree. But to get the whole thing you can't easily just refer to
> it.

I'm not entirely following this last sentence, but I'm encouragerd by
the agreement (if I understand correctly) that the subcommands could use
the catalog instead of the parsetree. The question of if that makes
them useful for other callers is perhaps up for some debate, but they
certainly look valuable from here. They may need to get extended to
have other options passed in (include defaults, include constraints,
etc) which then cause other sub-commands to be called (or perhaps
there's a larger function overall which is called and handles those
pieces and combines the results and the sub-command remains the same)
but all of that is good and interesting discussion which can happen if
we write these functions without the parsetree. With the parsetree
requirement, we can't really even have those discussions.

> > All I'm suggesting is that we focus on collecting the information from
> > the catalog and avoid using the parsetree. For at least the CREATE and
> > DROP cases, that should be entirely possible.
>
> DROP's already in 9.4, the additions in 9.5 were more or less usability
> things. The problem generating DROPs is right now more identifying
> which one you want to drop and checking the dependencies - the latter
> I'm not sure how to do without actually executing the DROP.

I'm a bit confused by this as executing the DROP is going to follow the
entries in pg_depend, which we could do also.. What am I missing there?

> > > I think this is a different feature that might end up sharing some of
> > > the infrastructure, but not more.
> >
> > I know you've looked through this code also and I really don't get the
> > comment that only "some" of this infrastructure would be shared. As I
> > tried to point out, for the 'CREATE POLICY' case, a few minor
> > modifications would have it provide exactly what I'm suggesting and I'm
> > sure that most of the cases would be similar. Simply looking through
> > the code with an eye towards avoiding the parsetree in favor of pulling
> > information from the catalog would illustrate the point I'm making, I
> > believe.
>
> I've no problem at all changing CREATE POLICY (and some other) deparse
> functions to look more at the catalog than the command. What I don't see
> is changing all of the create deparse functions, guarantee that they are
> usable for getting the DDL of catalog objects and provide SQL
> accessible infrastructure for that. That's a *massive* undertaking.

Perhaps we can agree to start with the first sentence in this paragraph
then? I'm not against waiting for the other pieces until we've had more
time to properly have that discussion.

> What I mean with 'sharing some of the infrastructure' is that I can see
> a good portion of the deparse_* functions being reused for what you
> desire.
>
> But the decision about which of those to call will be an entirely
> separate project. So is a whole new infrastructure to consider locking
> and visibility (all the deparse stuff uses continualy evolving catalog
> snapshots!) that it'll need as that's a problem the event trigger stuff
> has much less to care about, because the objects are new rows and thus
> can't be updated by other backends.

For my part, at least, I'd want my current backend to show the tables
based on what has already happened and not based on some point prior, so
either this is exactly what is wanted or I've misunderstood your point
here. I realize that the same isn't exactly true for pg_dump, but
that's also not the use-case I was suggesting (though I do admit that it
might be interesting to figure out if there's a way for pg_dump to make
use of this, but that's an even farther afield discussion).

> > > I don't think it's all that related, so I'm not surprised. If you
> > > execute a CREATE TABLE(id serial primary key); you'll get a bunch of
> > > commands with this facility - it'd be much less clear what exactly shall
> > > be dumped in the case of some hypothetical GUI when you want to dump the
> > > table.
> >
> > I really don't think it's all that strange or complicated to consider
> > and we've got a rather nice example of what a good approach would be.
>
> Right. We got a *massive* program that solves dependencies and doesn't
> do all that much useful/correct things if you only let it dump
> individual objects. And that dumping of individual objects is what you
> want... ;)

"Dumping" as in pg_dump-style might be a bit beyond what I'm thinking
about at the moment. Regardless, as I've tried to point out above, the
changes which I'm actually suggesting for this initial body of work are
just to avoid the parsetree and go based off of what the catalog has.
I'm hopeful that's a small enough and reasonable enough change to happen
during this commitfest. I don't have any issue tabling the rest of the
discussion and future work based on that to a later time.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2015-02-24 12:44:17 Re: A typo in backend/replication/README
Previous Message Heikki Linnakangas 2015-02-24 12:24:32 contrib/fuzzystrmatch/dmetaphone.c license