Re: Sort-of replication for reporting purposes

From: Stuart Bishop <stuart(at)stuartbishop(dot)net>
To: Ivan Voras <ivoras(at)gmail(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Sort-of replication for reporting purposes
Date: 2017-01-13 11:00:31
Message-ID: CADmi=6OyCuOo85=Y2xxe_u7zHr0nDfwjLhNeWWBVYfsps8ZzGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 7 January 2017 at 02:33, Ivan Voras <ivoras(at)gmail(dot)com> wrote:

>
>
> On 6 Jan 2017 8:30 p.m., "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> wrote:
>
> On Fri, Jan 6, 2017 at 12:24 PM, Ivan Voras <ivoras(at)gmail(dot)com> wrote:
> > Hello,
> >
> > I'm investigating options for an environment which has about a dozen
> servers
> > and several dozen databases on each, and they occasionally need to run
> huge
> > reports which slow down other services. This is of course "legacy code".
> > After some discussion, the idea is to offload these reports to separate
> > servers - and that would be fairly straightforward if not for the fact
> that
> > the report code creates temp tables which are not allowed on read-only
> hot
> > standby replicas.
> >
> > So, the next best thing would be to fiddle with the storage system and
> make
> > lightweight snapshots of live database clusters (their storage volumes)
> and
> > mount them on the reporting servers when needed for the reports. This is
> a
> > bit messy :-)
> >
> > I'm basically fishing for ideas. Are there any other options available
> which
> > would offer fast replication-like behaviour ?
> >
> > If not, what practices would minimise problems with the storage snapshots
> > idea? Any filesystem options?
>
> I've always solved this with slony replication, but pg_basebackup
> should be pretty good for making sort of up to date slave copies. Just
> toss a recovery.conf file and touch whatever failover file the slave
> expects etc.
>
>
> I forgot to add one more information, the databases are 50G+ each so doing
> the base backup on demand over the network is not a great option.
>

If you don't want to rebuild your report databases, you can use PostgreSQL
built in replication to keep them in sync. Just promote the replica to a
primary, run your reports, then wind it back to a standby and let it catch
up. pg_rewind might be able to wind it back, or you could use a filesystem
snapshot from before you promoted the replica to a primary. You do need to
ensure that the real primary keep enough WAL logs to cover the period your
report database is broken out.

Personally though, I'd take the opportunity to set up wal shipping and
point in time recovery on your primary, and rebuild your reporting database
regularly from these backups. You get your fresh reporting database on
demand without overloading the primary, and regularly test your backups.

--
Stuart Bishop <stuart(at)stuartbishop(dot)net>
http://www.stuartbishop.net/

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ivan Voras 2017-01-13 11:17:48 Re: Sort-of replication for reporting purposes
Previous Message ProPAAS DBA 2017-01-12 17:48:24 Re: Sort-of replication for reporting purposes