Re: Duplicating data folder without tablespace, for read access

From: Jack Cushman <jcushman(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Duplicating data folder without tablespace, for read access
Date: 2018-08-28 20:21:24
Message-ID: CAEv_OHW5QZOznQsSfsas+uXXPXh=f+Puc8v+eqHQz77s3pOr7A@mail.gmail.com
Lists: pgsql-general

To follow up, ZFS snapshots (appear to) offer a great solution to the
problem I posed a couple of weeks ago, and avoid any hacky misuse of
tablespaces.

My goal was to have a database with a 100GB table and a 600GB table, and to
routinely and efficiently clone the 100GB table and its indexes to a
cluster on another machine.

The general procedure for solving this with ZFS is (rough commands are
sketched after the list):

- zfs snapshot the source data directory (after shutting down the database
or taking appropriate steps to get a clean copy)
- zfs clone the snapshot on the same machine. This takes no actual disk
space or time because of copy-on-write.
- Run postgres using the cloned data directory and truncate unwanted
tables. This still takes minimal real disk space.
- zfs send the cloned data directory to the remote machine. If running
repeatedly (as I am), use incremental send to avoid resending unchanged
blocks.
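
A rough sketch of those commands, assuming the data directory sits at the
top of a dataset called tank/pgdata; the dataset, database, table, host,
port, and snapshot names here are all placeholders:

  # take a clean snapshot of the source data directory
  pg_ctl -D /tank/pgdata stop
  zfs snapshot tank/pgdata@release-1
  pg_ctl -D /tank/pgdata start

  # clone the snapshot on the same machine (copy-on-write, so effectively free)
  zfs clone tank/pgdata@release-1 tank/pgdata_release

  # start postgres on the clone, on a spare port, and drop the unwanted table
  pg_ctl -D /tank/pgdata_release -o "-p 5433" start
  psql -p 5433 -d mydb -c "TRUNCATE big_table"
  pg_ctl -D /tank/pgdata_release stop

  # ship the trimmed clone to the release server (assumes tank/pgdata does
  # not already exist there; repeat runs would use zfs send -i against a
  # snapshot both machines already share, so only changed blocks move)
  zfs snapshot tank/pgdata_release@release-1
  zfs send tank/pgdata_release@release-1 | ssh release-server zfs recv tank/pgdata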

The upshot is to waste minimal time copying bits that are unwanted or
haven't changed. To mix in Stephen's suggestion, do this from a backup
server to exercise the backups.

This blog post was helpful in figuring out how to get all that working:
https://blog.2ndquadrant.com/pg-phriday-postgres-zfs/

Thanks,
Jack

On Tue, Aug 14, 2018 at 11:57 AM Jack Cushman <jcushman(at)gmail(dot)com> wrote:

> Hi --
>
> I'm wondering whether, in my specific situation, it would be safe to copy
> a database cluster's data folder, and bring up the copy for read access,
> without copying a tablespace linked from it. My situation (described below)
> involves a database with a 100GB table and a 600GB table where I want to
> routinely clone just the 100GB table for web access.
>
> ---
>
> For context, the last discussion I've found is from 2013, in this blog
> post from Christophe Pettus and response from Tom Lane:
>
>
> https://thebuild.com/blog/2013/03/10/you-cannot-recover-from-the-loss-of-a-tablespace/
> https://www.postgresql.org/message-id/19786.1367378009@sss.pgh.pa.us
>
> In that discussion, Christophe summarized the situation this way:
>
> > I would not count on it. And if it works 100% reliably now, it might
> not on a future version of PostgreSQL.
>
> > As Josh Berkus pointed out to me off-list, there are two competing
> definitions of the term "recover" in use here:
>
> > 1. In my blog post, the definition of "recover" was "bring up the
> database without having unusually extensive knowledge of PostgreSQL's
> internals."
> > 2. For Tom, the definition of "recover" is "bring up the database if you
> have appropriate knowledge of PostgreSQL's internals."
>
> > You can't recover from the loss of a tablespace per definition #1. You
> can per definition #2.
>
> > I'd strongly suggest that relying on definition #2, while absolutely
> correct, is a poor operational decision for most users.
>
>
> https://www.postgresql.org/message-id/FABAC7F1-3172-4B5D-8E56-0B3C579980EC%40thebuild.com
>
> ---
>
> Now here's the situation where I want to do what Christophe said not to
> do: :)
>
> I have a large database of text, with a 600GB table and a 100GB table
> connected by a join table. They both see occasional updates throughout the
> week. Once a week I want to "cut a release," meaning I will clone just the
> 100GB table and copy it to a "release" server for read-only web access.
>
> My procedure would be:
>
> - keep the 600GB table on a separate tablespace
> - cleanly stop postgres on both servers
> - copy the data folder to the release server
> - delete pg_tblspc/* on the release server
> - start postgres on both servers
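>
> As a rough sketch of those steps (the table, tablespace, path, and host
> names below are placeholders):
>
>   -- one-time setup: keep the 600GB table on its own tablespace
>   CREATE TABLESPACE big_ts LOCATION '/mnt/big_tablespace';
>   ALTER TABLE big_table SET TABLESPACE big_ts;
>
>   # weekly release, after cleanly stopping postgres on both servers
>   rsync -a --delete /var/lib/postgresql/data/ release-server:/var/lib/postgresql/data/
>   ssh release-server 'rm -f /var/lib/postgresql/data/pg_tblspc/*'
>   # then start postgres again on both servers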
>
> In local testing this seems to work -- the release server works fine, and
> I only get an error message if I try to access the missing tables, which is
> expected. But are there reasons this is going to bite me if I try it in
> production? I'm hoping it helps that (a) I'm only doing read access, (b) I
> can cleanly stop both servers before cutting a release, and (c) I'm not
> worried about losing data, since it's just an access copy.
>
> Alternatives I've considered:
>
> - I could pg_dump and restore, but the 100GB table has lots of indexes and
> I'd rather not have to reindex on the release server each week.
> - I could replicate with pglogical and use some sort of blue-green setup
> on the release server to cut a release, but this adds a lot of moving
> parts, especially to deal with schema migrations.
>
> Thanks for any advice you might have!
>
> -Jack
>
