PostgreSQL notes and links



Local Notes

Some general concepts and experiences we've had:

There is one postmaster per database cluster. The postmaster uses common system files/tables to maintain all databases in the cluster. There is also a common write-ahead-log directory which serves all databases in the cluster.

The initdb command creates the common system files, and we can specify that they be placed anywhere. In our case, I have left these common files in the default of /var/lib/postgres/data because it makes sense to keep the WAL files & system tables on a different disk from the data itself. The total initial space used by the common files is about 20 MB at the outset, of which 16MB is the first WAL file. initdb *must* be run by the postgres user. [su ; su - postgres ; initdb]

The location of the common files may be specified (assuming an initdb has been previously done for that location) when starting the postmaster. By default, databases in the cluster are created in this location as well, but it is possible to specify a different location for each database in the cluster using the procedure below:

The initlocation command creates the data areas which may then be used by the postmaster. In our case, we have created PGDATA_DEMO (/data/pgdata_demo), and PGDATA_ALPHA (/data/pgdata_alpha). [su - postgres ; initlocation PGDATA_DEMO ; initlocation PGDATA_ALPHA ] We anticipate adding PGDATA_PROD and PGDATA_STAGING later.

Note: The environment variables listed above must be defined and exported in /var/lib/postgres/.profile before starting the postmaster. It makes most sense to define all of these as step one.

The createdb command creates the databases themselves. The postmaster must be running before this happens [/etc/init.d/posgresql start]. We have created databases in the locations noted above. [su ; /etc/init.d/postgresql start ; su - postgres ; createdb demo -D PGDATA_DEMO ; createdb alpha -D PGDATA_ALPHA] note that there are no '$' before the environment variables.

To dump (export) a database, this command works: [pg_dump demo >demo.dump]

To restore (import) a database, this command works: [psql -d alpha -f demo.dump]

The two commands shown above exported the contents of demo & reloaded them in alpha. These are fairly small (41Mb) databases. It took about one minute to dump & restore. The created file is simply an SQL file containing creates, inserts, etc... This means we'll want to compress it before sending to tape.

WAL files are basically the same thing as redo logs in Oracle- they are used to both buffer & record transactions against the database tables, resulting in better IO performance and immediate logging of transactions to ensure consistent recovery results. Each WAL file is 16 Mb, and when the current file is 75% full, a new WAL file is created. We can control the number of WAL files initially created and the space between checkpoints via parameters in /etc/postgres/postgresql.conf. I don't yet know if there is any limit on the number of WAL files or a mechanism for automatic cycling or archiving. I've submitted several questions about this to the pgsql-admin list.

The response I've gotten is that in v7.1.3, only 2-3 WAL files are ever created. I'm a bit suspicious of the 2-3... it seems as if the answer should be exactly one number, so take this information with a grain of salt

Some observed behavior:

After creating the demo data & doing a dump/restore of it into alpha, we had three WAL files (D,E,F).

After restarting the postmaster, we had two WAL files (E,F).

My conjecture is that un-needed WAL files get cleaned out when a restart is done, but perhaps not in the normal course of events.

Doing a vacuum analyze on both databases did not cause a third WAL file to be created- conjecture: vacuum doesn't cause much impact on WAL files, *or* since I had recently done a vacuum on demo & alpha was created frm a dump, neither had much to be done.

We currently have plenty of space in /var for WAL files. After I recreated the databases using the procedure outlined above, we had recovered about 100 Mb. I'm pretty sure that in my confusion, I did an initdb for our designated data area and assumed that a createdb would then use the same area. Since the evidence is all gone, I'm not certain what happened, but regardless, the procedure above places everything where we want it.

Useful links

Note: PDF files are cached locally & hence may become stale