Re: pg_dump dump catalog ACLs

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Jose Luis Tallon <jltallon(at)adv-solutions(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump dump catalog ACLs
Date: 2016-05-04 12:14:55
Message-ID: 20160504121455.GV10850@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Noah,

* Noah Misch (noah(at)leadboat(dot)com) wrote:
> On Mon, Apr 25, 2016 at 12:39:09AM -0400, Stephen Frost wrote:
> > * Noah Misch (noah(at)leadboat(dot)com) wrote:
> > > On Fri, Apr 22, 2016 at 12:31:41PM -0400, Stephen Frost wrote:
> > > > After looking through the code a bit, I realized that there are a lot of
> > > > object types which don't have ACLs at all but which exist in pg_catalog
> > > > and were being analyzed because the bitmask for pg_catalog included ACLs
> > > > and therefore was non-zero.
> > > >
> > > > Clearing that bit for object types which don't have ACLs improved the
> > > > performance for empty databases quite a bit (from about 3s to a bit
> > > > under 1s on my laptop). That's a 42-line patch, with comment lines
> > > > being half of that, which I'll push once I've looked into the other
> > > > concerns which were brought up on this thread.
> > >
> > > That's good news.
> >
> > Attached patch-set includes this change in patch #2.
>
> Timings for the 100-database pg_dumpall:
>
> HEAD: 131s
> HEAD+patch: 33s
> 9.5: 8.6s
>
> Nice improvement for such a simple patch.

Patch #2 in the attached patchset includes that improvement and a
further one which returns the performance to very close to 9.5. This is
done by checking for any changed ACLs for a given table (the table-level
and column-level ones) and removing the ACL component if there haven't
been any changes.

> > Patch #1 is the fix for the incorrect WHERE clause.
>
> I confirmed that this fixed dumping of the function ACL in my test case.

This patch is unchanged.

> > For languages, I believe that means that we simply need to modify the
> > selectDumpableProcLang() function to use the same default we use for the
> > 'pg_catalog' namespace- DUMP_COMPONENT_ACL, instead of
> > DUMP_COMPONENT_NONE.
>
> Makes sense.

This is now being tested in the TAP testing for pg_dump that I've been
working on (patch #3). Fixing this issue was a bit more complicated
because of cases like plpgsql, which is a from-initdb extension, but we
still want to dump out any changes to plpgsql that the user has made, so
we have to dump ACLs for from-initdb extension objects, if they've been
changed.

Those fixes are included in patch #2 in the patchset.

> > What's not clear to me is what, if any, issue there is with namespaces.
>
> As far as I know, none. The current behavior doesn't match the commit
> message, but I think the current behavior is better.

Great. No changes have been made to how namespaces are handled.

> > Certainly, in my testing at least, if you do:
> >
> > REVOKE CREATE ON SCHEMA public FROM public;
> >
> > Then you get the appropriate commands from pg_dump to implement the
> > resulting ACLs on the public schema. If you change the permissions back
> > to match what is there at initdb-time (or you just don't change them),
> > then there aren't any GRANT or REVOKE commands from pg_dump, but that's
> > correct, isn't it?
>
> Good question. I think it's fine and possibly even optimal. One can imagine
> other designs that remember whether any GRANT or REVOKE has happened since
> initdb, but I see no definite reason to prefer that alternative.

Neither do I.

> > In the attached patch-set, patch #3 includes support for
> >
> > src/bin/pg_dump: make check
> >
> > implemented using the TAP testing system. There are a total of 360
> > tests, generally covering:
>
> I like the structure of this test suite.

Great. I've added a whole ton more tests, added more options to various
pg_dump runs to test more options with fewer runs (though there are
still quite a few...) and added more objects to be created. Also added
a bunch of comments to describe how the test suite is set up.

> > +# Start with 1 because of non-existant database test below
> > +# Test connecting to a non-existant database
>
> Spelling.

Fixed.

The test suite is now covering 57% of pg_dump.c. I was hoping to get
that number higher, but time marches on and more tests can certainly be
added later.

I'm planning to do another review of this patch-set and do testing
against back-branches back to 7.4. Barring any issues or objections,
I'll commit this tomorrow to address the performance concerns and to add
in the test suite.

On my system, the test suite takes about 15 seconds to run, which
includes setting up the cluster, creating all the objects, and then
running the pg_dump runs and checking the results. All told, in those
15 seconds, 1,213 tests are run.

Thanks!

Stephen

Attachment Content-Type Size
pg_dump_improvements_v2.patch text/x-diff 75.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2016-05-04 12:23:38 Re: psql :: support for \ev viewname and \sv viewname
Previous Message Alexander Korotkov 2016-05-04 11:36:29 Re: atomic pin/unpin causing errors