Re: Help making tablespaces work for my application

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Help making tablespaces work for my application
Date: 2011-06-24 07:26:53
Message-ID: 4E043C3D.7080506@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 06/23/2011 09:37 AM, Natusch, Paul wrote:
>
> I have an application for which data is being written to many disks
> simultaneously. I would like to use a postgres table space on each
> disk. If one of the disks crashes it is tolerable to lose that data,
> however, I must continue to write to the other disks.
>

Tablespaces are not useful for implementing this idea yet.

> 1. There is a single WAL log for the entire cluster, located in the
> pg_log subdirectory. If the disk containing the pg_log file crashed,
> does that mean the system would come to a halt. Is there anyway to
> distribute this data so that WAL is located on the same media as the
> table space? An alternative would be to use raid with the disk that
> stores the pg_log subdirectory but that adds cost to the system.
>

Loss of the pg_xlog subdirectory and the WAL contained in it normally
results in catastrophic database failure. Recommended practice is to
use a RAID-1 volume to make odds of that failure lower.
>
> 2. If #1 was solved by using the raid approach, what happens if one
> of the disks containing one of my table spaces crashes. At some point
> postgres will want to write the data from the WAL file to the crashed
> (unavailable) disk. Will postgres will be blocked at this point?
> Is there some way to notify postgres that a specific disk is no
> longer available and that the entries in the WAL for this disk should
> either be purged or ignored? ( I'm willing to "throw away" the data on
> the crashed disk).
>

PostgreSQL can't be expected to operate sanely when faced with the loss
of an individual tablespace. It may be possible to recover from it, but
you'll be doing something it's not designed to handle, and that effort
may not succeed.

Note that any tablespace failure is likely to require taking down the
database to repair the involved tablespaces, so you're likely to have
downtime between a component failure and when you notice to take
action. The database really does not like having tablespaces just go
away in the middle of operations.

PostgreSQL 9.1 (not released yet, currently in beta) includes a new
feature called "unlogged tables" that might make this sort of deployment
possible. If you created a tablespace for disposable data and put an
unlogged table onto it, loss of that tablespace would me much less
likely to cause a problem. So long as you recreated a new space for the
unlogged table after restarting, you could probably recover having only
lost the data on the crashed disk in this situation.

> Clearly using raid on all of the disks would be a solution, but that
> is cost prohibitive.
>
>

On a cheap server I can easily RAID-1 mirror a pair of drives on Linux
using software RAID, and individual drives are $50 to $100 each. If
your data isn't worth that much, And even that's not enough to really
make me feel secure about the data--you really need to keep another copy
around as a backup, too. You can treat your data as disposable and
expect to lose it when any single component fails, or you can include
some good redundancy practices in the design to reduce odds of a
failure. There really isn't really a good solution providing partial
protection in the middle of those two.

--
Greg Smith 2ndQuadrant USgreg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Supportwww.2ndQuadrant.us
"PostgreSQL 9.0 High Performance":http://www.2ndQuadrant.com/books

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Greg Smith 2011-06-24 07:29:51 Re: Oracle / PostgreSQL comparison...
Previous Message John R Pierce 2011-06-24 07:25:42 Re: Oracle / PostgreSQL comparison...