From: | Tiemo Kieft <t(dot)kieft(at)infinitesimal(dot)nl> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Streaming replication and sharding |
Date: | 2013-02-17 08:02:29 |
Message-ID: | CAAJu91fxXzTrOdfuwdx62KxgktYJ83Z+kZNo-KrJszZzV8P7jA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi,
We are developing an application that uses various web analytics packages
(like Google Analytics) to run analyses on. We are currently in closed beta
stadium where we don't have a lot of data in the database, but at some
point it will grow considerably.
We basically have two different sets of data, on the one hand we have raw
metrics from the datasource, and on the other hand we have account and meta
information. The former can be re-downloaded at any time, and will grow to
quite large sizes. The latter set is the one that we really care about, and
don't want to risk losing.
Currently we plan on using streaming replication to replicate all data to
at least one slave, for the near future this will do, since we can run some
of the large (read-only) aggregation queries on the slave database. In the
future the dataset might grow to the point where we need to start thinking
about sharding. The analytics data can be sharded on a per-customer basis,
and doesn't have to be replicated.
Since Postgres doesn't support per-table streaming replication (as far as I
can tell), the only solution would be to run two separate instances of
postgres per server. One instance is replicated to all servers, and will
contain account and other important information. The other instance is used
to store analytics data. Is this a viable way of solving this problem, or
are we overlooking something?
The problem is not really immediate, as the dataset is currently small
enough to fit on one machine (and replicated to a second), just want to be
future proof, and get this solved before the problems start.
--
- Tiemo
From | Date | Subject | |
---|---|---|---|
Next Message | Tim Uckun | 2013-02-17 08:16:16 | Why is this a cross join? |
Previous Message | Kevin Grittner | 2013-02-16 18:30:44 | Re: PG9.2.3. Query hanging: SELECT count(*) FROM pg_catalog.pg_class... |