Quick Links

proposal: CREATE DATABASE vs. (partial) CHECKPOINT

From:	Tomas Vondra <tv(at)fuzzy(dot)cz>
To:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	proposal: CREATE DATABASE vs. (partial) CHECKPOINT
Date:	2014-10-26 21:47:39
Message-ID:	544D6BFB.4060906@fuzzy.cz
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi all,

currently, CREATE DATABASE forces an immediate checkpoint (actually, it
forces two, but the time interval is usually rather small). For
traditional deployments this is not a big deal, because creating a
database is a rare event, and may be planned to off-peak times.

However for shared (cloud-like) deployments, this is not the case. E.g.
we're hosting hundreds (or even thousands) of customer databases on some
clusters, and creating a new database is quite common.

This turns the checkpoints into a significant pain point for us, because
it forces a write of all the dirty buffers from all the databases. No
matter how well we tune the spread checkpoints, this makes it
inefficient and causes significant I/O spikes (especially with larger
shared_buffer values). It also leads to high duration of the CREATE
DATABASE command, making it rather inpractical for 'interactive' use (a
user hitting a button in a UI or something).

Based on the talks from pgconf.eu, where I've seen this mentioned in at
least two talks (and in the hallway track), we're not alone. I'd like to
address this, if possible.

The usual workaround for this is "create the databases in advance" but
that's not always possible (e.g. when having more than handful of
templates, or when the template evolves over time).

After eyeballing the code for an hour or two, I think CREATE DATABASE
should be fine with performing only a 'partial checkpoint' on the
template database - calling FlushDatabaseBuffers and processing unlink
requests, as suggested by the comment in createdb().

It's not exactly trivial change, but it does not seem frighteningly
difficult coding either.

The templates are usually static, so this would minimize both the CREATE
DATABASE duration and disruption to the cluster it causes.

My fear however is that this while the code will work, it will break the
recovery in some subtle way (as illustrated by the comments about 8.0
PITR bugs in createdb).

Am I missing something that makes this dead in the water?

regards
Tomas

Responses

Re: proposal: CREATE DATABASE vs. (partial) CHECKPOINT at 2014-10-27 09:47:05 from Heikki Linnakangas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Noah Misch	2014-10-26 22:22:01	Re: narwhal and PGDLLIMPORT
Previous Message	Andrew Dunstan	2014-10-26 20:57:09	Re: strip nulls functions for json and jsonb