From: | Gurjeet Singh <gurjeet(at)singh(dot)im> |
---|---|
To: | PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Proposing pg_hibernate |
Date: | 2014-02-04 00:18:54 |
Message-ID: | CABwTF4Ui_anAG+ybseFunAH5Z6DE9aw2NPdy4HryK+M5OdXCCA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Please find attached the pg_hibernate extension. It is a
set-it-and-forget-it solution to enable hibernation of Postgres
shared-buffers. It can be thought of as an amalgam of pg_buffercache and
pg_prewarm.
It uses the background worker infrastructure. It registers one worker
process (BufferSaver) to save the shared-buffer metadata when server is
shutting down, and one worker per database (BlockReader) when restoring the
shared buffers.
It stores the buffer metadata under $PGDATA/pg_database/, one file per
database, and one separate file for global objects. It sorts the list of
buffers before storage, so that when it encounters a range of consecutive
blocks of a relation's fork, it stores that range as just one entry, hence
reducing the storage and I/O overhead.
On-disk binary format, which is used to create the per-database save-files,
is defined as:
1. One or more relation filenodes; stored as r<relfilenode>.
2. Each realtion is followed by one or more fork number; stored as
f<forknumber>
3. Each fork number is followed by one or more block numbers; stored as
b<blocknumber>
4. Each block number is followed by zero or more range numbers; stored as
N<number>
{r {f {b N* }+ }+ }+
Possible enhancements:
- Ability to save/restore only specific databases.
- Control how many BlockReaders are active at a time; to avoid I/O storms.
- Be smart about lowered shared_buffers across the restart.
- Different modes of reading like pg_prewarm does.
- Include PgFincore functionality, at least for Linux platforms.
The extension currently works with PG 9.3, and may work on 9.4 without any
changes; I haven't tested, though. If not, I think it'd be easy to port to
HEAD/PG 9.4. I see that 9.4 has put a cap on maximum background workers via
a GUC, and since my aim is to provide a non-intrusive no-tuning-required
extension, I'd like to use the new dynamic-background-worker infrastructure
in 9.4, which doesn't seem to have any preset limits (I think it's limited
by max_connections, but I may be wrong). I can work on 9.4 port, if there's
interest in including this as a contrib/ module.
To see the extension in action:
.) Compile it.
.) Install it.
.) Add it to shared_preload_libraries.
.) Start/restart Postgres.
.) Install pg_buffercache extension, to inspect the shared buffers.
.) Note the result of pg_buffercache view.
.) Work on your database to fill up the shared buffers.
.) Note the result of pg_buffercache view, again; there should be more
blocks than last time we checked.
.) Stop and start the Postgres server.
.) Note the output of pg_buffercache view; it should contain the blocks
seen just before the shutdown.
.) Future server restarts will automatically save and restore the blocks in
shared-buffers.
The code is also available as Git repository at
https://github.com/gurjeet/pg_hibernate/
Demo:
$ make -C contrib/pg_hibernate/
$ make -C contrib/pg_hibernate/ install
$ vi $B/db/data/postgresql.conf
$ grep shared_preload_libraries $PGDATA/postgresql.conf
shared_preload_libraries = 'pg_hibernate' # (change requires restart)
$ pgstart
waiting for server to start.... done
server started
$ pgsql -c 'create extension pg_buffercache;'
CREATE EXTENSION
$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not
null group by reldatabase;'
count
-------
163
14
(2 rows)
$ pgsql -c 'create table test_hibernate as select s as a, s::char(1000) as
b from generate_series(1, 100000) as s;'
SELECT 100000
$ pgsql -c 'create index on test_hibernate(a);'
CREATE INDEX
$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not
null group by reldatabase;'
count
-------
2254
14
(2 rows)
$ pgstop
waiting for server to shut down....... done
server stopped
$ pgstart
waiting for server to start.... done
server started
$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not
null group by reldatabase;'
count
-------
2264
17
(2 rows)
There are a few more blocks than the time they were saved, but all the
blocks from before the restart are present in shared buffers after the
restart.
Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/
Attachment | Content-Type | Size |
---|---|---|
pg_hibernate.tgz | application/x-gzip | 8.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2014-02-04 00:40:33 | Re: Failure while inserting parent tuple to B-tree is not fun |
Previous Message | Stephen Frost | 2014-02-04 00:13:46 | Re: [COMMITTERS] pgsql: Include planning time in EXPLAIN ANALYZE output. |