Re: PostgreSQL configuration

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql(at)mohawksoft(dot)com
Cc: "Stephan Szabo" <sszabo(at)megazone(dot)bigpanda(dot)com>, "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz>, rm_pg(at)cheapcomplexdevices(dot)com, "Christopher Browne" <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL configuration
Date: 2004-04-12 20:21:24
Message-ID: 13376.1081801284@sss.pgh.pa.us
Lists: pgsql-hackers

I just had a thought about this: seems like a big part of the objection
is the risk of specifying -C and -D that don't go together. Well, what
if they were the same switch? Consider the following simplification of
the proposed patch:

1. Postmaster has just one switch, '-D datadir' with fallback to
environment variable PGDATA, same as it ever was.

2. The files that must be found in this directory are the configuration
files, namely postgresql.conf, pg_hba.conf, pg_ident.conf. (And any
files they include are taken as relative to this directory if not
specified with absolute path. We'll still add the #include facility
to postgresql.conf; the others have it already IIRC.)

3. postgresql.conf can contain a new changeable-only-at-startup
configuration setting which we need to think of a good name for.
("datadir" seems confusing to me in this context, though maybe it would
do; anyway I haven't got a better idea yet.) All the non-configuration
files are located under that directory. Of course it defaults to being
the -D directory if not specified in postgresql.conf.
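
A minimal sketch of the lookup rules in points 1-3, written in Python
purely for illustration (the postmaster itself is C). The setting name
"datadir" is only the placeholder used in this message, and the function
names are hypothetical. Whether a relative "datadir" value should be
resolved against the -D directory is not settled here, so the sketch
returns it unchanged.

```python
import posixpath


def resolve_config_dir(dash_d, environ):
    """Point 1: use -D if given, otherwise fall back to PGDATA."""
    if dash_d:
        return dash_d
    if "PGDATA" in environ:
        return environ["PGDATA"]
    raise ValueError("no -D switch and PGDATA is not set")


def resolve_include(config_dir, include_path):
    """Point 2: included files are relative to the -D directory
    unless they are given with an absolute path."""
    if posixpath.isabs(include_path):
        return posixpath.normpath(include_path)
    return posixpath.normpath(posixpath.join(config_dir, include_path))


def resolve_data_dir(config_dir, settings):
    """Point 3: the (placeholder-named) "datadir" setting says where the
    non-configuration files live; it defaults to the -D directory."""
    return settings.get("datadir", config_dir)
```

With no "datadir" entry the two directories coincide, which is the
same-as-it-ever-was default.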

If we do things this way, we have the following properties:

* Default behavior is same as it ever was, in particular there is no
difficulty in making a test installation in a nontypical place.

* Config files can easily be separated from data and can be backed up
separately (no need for the etc/ or config/ subdirectory Bruce
suggested).

* It is not directly possible to use the same config with multiple
databases. However one can easily imagine pointing the postmaster
to a config file that contains only a "datadir = " spec and a
#include of a sharable config file. (I have to confess not having
thought about doing that in connection with the original patch
proposal.)

* If you want to think of this as config-centric, you can; if you want
to think of it as data-centric, you can do that too. It's agnostic.

A typical setup for sharable config files would look like this:
you make directories named say "/etc/postgresql/postmasterN" which
will be the -D targets for each of your postmasters. These contain
postgresql.conf files that contain "datadir = someplace" and
"include ../sharedconfigfile" and nothing else. Shared config files
live in /etc/postgresql, per-database ones in its subdirectories.
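
To make that layout concrete, here is a rough Python sketch that builds
it under a temporary directory and reads it back. Several things in it
are assumptions for illustration only: the loader function, the
"include FILE" line syntax (taken from the wording above, not from any
implemented parser), the /srv/pgdataN paths, and the shared_buffers
value placed in the shared file.

```python
import os
import tempfile


def load_config(config_dir, filename="postgresql.conf", settings=None):
    """Read 'name = value' lines into a dict. An 'include FILE' line pulls
    in another file, resolved relative to the -D directory (config_dir)
    unless FILE is an absolute path. Later entries override earlier ones."""
    settings = {} if settings is None else settings
    if os.path.isabs(filename):
        path = filename
    else:
        path = os.path.join(config_dir, filename)
    with open(path) as f:
        for raw in f:
            line = raw.split("#", 1)[0].strip()  # drop comments and blanks
            if not line:
                continue
            if line.startswith("include "):
                target = line[len("include "):].strip()
                load_config(config_dir, target, settings)
            else:
                name, _, value = line.partition("=")
                settings[name.strip()] = value.strip()
    return settings


def build_demo_layout():
    """Create <tmp>/etc/postgresql with two postmasterN subdirectories and
    one shared config file; return the list of -D directories."""
    etc = os.path.join(tempfile.mkdtemp(), "etc", "postgresql")
    config_dirs = []
    for n in (1, 2):
        d = os.path.join(etc, "postmaster%d" % n)
        os.makedirs(d)
        with open(os.path.join(d, "postgresql.conf"), "w") as f:
            f.write("datadir = /srv/pgdata%d\n" % n)   # hypothetical path
            f.write("include ../sharedconfigfile\n")
        config_dirs.append(d)
    with open(os.path.join(etc, "sharedconfigfile"), "w") as f:
        f.write("shared_buffers = 1000\n")             # illustrative value
    return config_dirs
```

Loading each postmasterN directory yields its own "datadir" plus the
shared settings, so the per-postmaster file stays two lines long.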

This notion is really almost the same as the patch-as-submitted, but
there are a couple of key differences:

* I did not like the patch's confusion over -C-specifies-config-directory
versus -C-specifies-config-file. One big reason not to like it is that
in the latter case it's not very clear what is the origin directory for
#include references in the config files. I think we would have a lot
less confusion if we adopted just the specify-a-config-directory behavior.
I don't see a use-case that justifies the config-file option nor the
separate postgresql.conf entries for pg_hba.conf and pg_ident.conf
(which would have to be extended any time we add another config file).
Surely requiring a separate config subdirectory for each postmaster
isn't an objectionable amount of overhead.

* There isn't a way to get things wrong on the command line. Well,
actually there is: if the "datadir" parameter works the same as all
other GUC parameters then one could override it on the command line
with "-c datadir=whatever". Depending on how strongly you feel about
that being a Bad Idea, we could imagine putting in a special prohibition
against it. But at least it wouldn't be the designed-in way of working
with shared config files.

* Barring the "-c datadir" scenario, there is a strong link from a
config subdirectory to its data area. A simple addition to the proposal
would be to add a back-link: on first start, the postmaster would
automatically make a file in the data directory that contains the
absolute path of the config dir; on subsequent starts, check it still
matches. This provides a simple interlock against accidentally starting
a postmaster with the wrong config files for the data area. (You could
break the interlock at need by deleting the back-link file.) In
particular, if you'd not bothered to remove the config files placed in
the data area by initdb, something like this is useful to ensure you
don't accidentally start the postmaster with -D pointing straight at
the data area where previously you'd pointed to a config directory.
It also provides documentation in both places about where the other
place is.
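
The back-link interlock in the last point could be sketched roughly as
follows, again in Python for illustration only. The back-link file name
and the function are hypothetical; nothing here describes an existing
postmaster behavior.

```python
import os

BACKLINK_NAME = "config_dir.backlink"  # hypothetical file name


def check_backlink(config_dir, data_dir):
    """On first start, record the absolute path of the config directory in
    the data directory. On later starts, refuse to proceed if the recorded
    path does not match the config directory now in use."""
    config_dir = os.path.abspath(config_dir)
    link_path = os.path.join(data_dir, BACKLINK_NAME)
    try:
        with open(link_path) as f:
            recorded = f.read().strip()
    except FileNotFoundError:
        # First start (or the admin deleted the file to break the interlock).
        with open(link_path, "w") as f:
            f.write(config_dir + "\n")
        return
    if recorded != config_dir:
        raise RuntimeError(
            "data directory %s was last started with config directory %s, "
            "not %s" % (data_dir, recorded, config_dir)
        )
```

Deleting the back-link file makes the next start behave like a first
start, which is the manual override described above.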

Something that remains unclear to me is what to do with the proposed
patch to support a secondary PID file. This strikes me as a solution
in search of a problem --- it was claimed that this makes it easier to
manipulate the postmaster with "standard Unix tools", but what tools are
those and do we really want people frobbing the postmaster with them?
Again I'm not sold on the use-case for the feature.

regards, tom lane
