Re: our checks for read-only queries are not great

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: our checks for read-only queries are not great
Date: 2020-01-10 20:22:26
Message-ID: 20200110202226.GA3195@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Fri, Jan 10, 2020 at 9:30 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > If somebody comes up with a patch
> > that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
> > whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
> > will you then decide that ALTER SYSTEM SET is no longer read-only?
> > Or, perhaps, reject such a patch on the grounds that it breaks this
> > arbitrary definition of read-only-ness?
>
> I would vote to reject such a patch as a confused muddle. I mean,
> generally, the expectation right now is that if you move your data
> from the current cluster to a new one by pg_dump, pg_upgrade, or even
> by promoting a standby, you're responsible for making sure that
> postgresql.conf and postgresql.auto.conf get copied over separately.

I really don't like that the ALTER SYSTEM SET/postgresql.auto.conf stuff
has to be handled in some way external to logical export/import, or
external to pg_upgrade (particularly in link mode), so I generally see
where Tom's coming from with that suggestion.

In general, I don't think people should be expected to hand-muck around
with anything in the data directory.

> In the last case, the backup that created the standby will have copied
> the postgresql.conf from the master as it existed at that time, but
> propagating any subsequent changes is up to you. Now, if we now decide
> to shove ALTER SYSTEM SET commands into pg_dumpall output, then
> suddenly you're changing that rule, and it's not very clear what the
> new rule is.

I'd like a rule of "users don't muck with the data directory", and we
are nearly there when you include sensible packaging such as what Debian
provides- by moving postrgesql.conf, log files, etc, outside of the data
directory. For things that can't be moved out of the data directory
though, like postgresql.auto.conf, we should be handling those
transparently to the user.

I agree that there are some interesting cases to consider here though-
like doing a pg_dumpall against a standby resulting in something
different than if you did it against the primary because the
postgresql.auto.conf is different between them (something that I'm
generally supportive of having, and it seems everyone else is too). I
think having an option to control if the postgresql.auto.conf settings
are included or not in the pg_dumpall output would be a reasonable way
to deal with that though.

> Now, our current approach is fairly arguable. Given that GUCs on
> databases, users, functions, etc. are stored in the catalogs and
> subject to backup, restore, replication, etc., one might well expect
> that global settings would be handled the same way. I tend to think
> that would be nicer, though it would require solving the problem of
> how to back out bad changes that make the database not start up.
> Regardless of what you or anybody thinks about that, though, it's not
> how it works today and would require some serious engineering if we
> wanted to make it happen.

This sounds an awful lot like the arguments that I tried to make when
ALTER SYSTEM SET was first going in, but what's done is done and there's
not much to do but make the best of it as I can't imagine there'd be
much support for ripping it out.

I don't really agree about the need for 'some serious engineering'
either, but having an option for it, sure.

I do also tend to agree with Tom about making ALTER SYSTEM SET be
prohibited in explicitly read-only transactions, but still allowing it
to be run against replicas as that's a handy thing to be able to do.

> > As another example, do we need to consider that replacing pg_hba.conf
> > via pg_write_file should be allowed in a "read only" transaction?
>
> I don't really see what the problem with that is. It bothers me a lot
> more that CLUSTER can be run in a read-only transaction -- which
> actually changes stuff inside the database, even if not necessarily in
> a user-visible way -- than it does that somebody might be able to use
> the database to change something that isn't really part of the
> database anyway. And pg_hba.conf, like postgresql.conf, is largely
> treated as an input to the database rather than part of it.

I don't like that CLUSTER can be run in a read-only transaction either
(though it seems like downthread maybe some people are fine with
that..). I'm also coming around to the idea that pg_write_file()
probably shouldn't be allowed either, and probably not COPY TO either
(except to stdout, since that's a result, not a change operation).

> Somebody could create a user-defined function that launches a
> satellite into orbit and that is, I would argue, a write operation in
> the truest sense. You have changed the state of the world in a lasting
> way, and you cannot take it back. But, it's not writing to the
> database, so as far as read-only transactions are concerned, who
> cares?

I suppose there's another thing to think about in this discussion, which
are FDWs, if the idea is that read-only means "I don't want to make
changes in THIS database". I don't really feel like that's what marking
a transaction as 'read only' is intended to mean though.

When I think of starting a read-only transaction, I feel like it's
usually with the idea of "I want to play it safe and I don't want this
transaction to make ANY changes". I'm feeling more inclined that we
should be going out of our way to make darn sure that we respect that
request of the user, no matter what it is they're running. We can't
prevent user-created C-level functions from launching satellites, but
that's an untrusted language and therefore it's up to the function
author to manage the transaction and privilege system properly anyway,
not ours.

Thanks,

Stephen

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-01-10 20:23:48 Re: Removing pg_pltemplate and creating "trustable" extensions
Previous Message Tom Lane 2020-01-10 19:39:58 Re: Removing pg_pltemplate and creating "trustable" extensions