Re: PATCH: Warn users about tablespace abuse data loss risk

From: Ian Lawrence Barwick <barwick(at)gmail(dot)com>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PATCH: Warn users about tablespace abuse data loss risk
Date: 2014-02-12 06:16:58
Message-ID: CAB8KJ=iFi6V5xRojdWeSC+CWNksRsAB9MNkjB9Th37aQg38w8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

2014-02-12 14:06 GMT+09:00 David Johnston <polobo(at)yahoo(dot)com>:
> Ian Lawrence Barwick wrote
>> 2014-02-12 12:52 GMT+09:00 Craig Ringer <
>
>> craig@
>
>> >:
>>> Hi all
>>>
>>> I've just seen another case of data loss due to misuse of /
>>> misunderstanding of tablespaces:
>>>
>>> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalling-postgres
>>>
>>> and it's prompted me to write some docs amendments to make it more
>>> obvious that *you shouldn't do that*.
>>>
>>> Not that it'll stop people, but it'll at least mean they can't say we
>>> didn't warn them.
>>>
>>> This is actually quite important, because many users are used to MySQL's
>>> MyISAM, where each table contains its own metadata and is readable by
>>> simply copying the table into a different MySQL install's data
>>> directory. It doesn't even have to be the same version! Users are
>>> clearly surprised that PostgreSQL tablespaces don't have the same
>>> properties.
>>>
>>> Thoughts?
>>
>> People still use MyISAM!?
>>
>> I had a similar issue pop up at work a while back, having something
>> explicit to point to is definitely a good idea.
>>
>> Suggestion for the first paragraph of the patch (sorry I can't provide it
>> in
>> patch form right now):
>>
>> Even if they are located outside the main PostgreSQL data directory,
>> tablespaces
>> are an integral part of the database cluster and
>> <emphasis>
>> cannot
>> </emphasis>
>> be
>> treated as an autonomous collection of data files. They rely on
>> metadata contained
>> in the main data directory, without which they are useless. In
>> particular, tablespaces
>> cannot be reattached to a different database cluster, and backing up
>> individual
>> tablespaces makes no sense as a backup/redundancy method. Similarly,
>> if you lose a
>> tablespace (file deletion, disk failure, etc) the main database may
>> become unreadable
>> or fail to start.
>>

> While providing additional warnings is good and necessary it may also help
> to be more descriptive as to in what situations tablespaces are appropriate
> and/or necessary so that people leave with a better understanding of why the
> feature exists and not just trying to know what not to use it for. It goes
> against the more prescriptive tone of the documentation generally but both
> approaches work well together to tackle the knowledge/understanding gap some
> users seem to have.

The warning would appear on this page:

http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html

which describes what tablespaces *can* do, but unless you're familiar with the
structure of the PostgreSQL data directories, it's not obvious what you *can't*
do. I recall reading a blog post a while back about tablespaces being "archived"
to the cloud with disastrous results, and a quick search pulls up
stuff like this:

http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instance

so it's definitely not a niche issue. Something "official" to link to
would be very
useful in this kind of situation. That doesn't preclude the general
documentation
being improved of course.

Regards

Ian Barwick

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Ian Lawrence Barwick 2014-02-12 06:35:09 Re: PATCH: Warn users about tablespace abuse data loss risk
Previous Message David Johnston 2014-02-12 05:06:33 Re: PATCH: Warn users about tablespace abuse data loss risk