From: | Alfred Perlstein <alfred(at)freebsd(dot)org> |
---|---|
To: | Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Stephen Frost <sfrost(at)snowman(dot)net>, Geoff Winkless <pgsqladmin(at)geoff(dot)dj>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Why we lost Uber as a user |
Date: | 2016-08-04 16:52:56 |
Message-ID: | 502d330e-09fb-e22a-911a-58806490fc00@freebsd.org |
Lists: | pgsql-hackers |
On 8/4/16 2:00 AM, Torsten Zuehlsdorff wrote:
>
>
> On 03.08.2016 21:05, Robert Haas wrote:
>> On Wed, Aug 3, 2016 at 2:23 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>> I don't think they are saying that logical replication is more
>>>> reliable than physical replication, nor do I believe that to be true.
>>>> I think they are saying that if logical corruption happens, you can
>>>> fix it by typing SQL statements to UPDATE, INSERT, or DELETE the
>>>> affected rows, whereas if physical corruption happens, there's no
>>>> equally clear path to recovery.
>>>
>>> Well, that's not an entirely unreasonable point, but I dispute the
>>> implication that it makes recovery from corruption an easy thing to do.
>>> How are you going to know what SQL statements to issue? If the master
>>> database is changing 24x7, how are you going to keep up with that?
>>
>> I think in many cases people fix their data using business logic. For
>> example, suppose your database goes down and you have to run
>> pg_resetxlog to get it back up. You dump-and-restore, as one does,
>> and find that you can't rebuild one of your unique indexes because
>> there are now two records with that same PK. Well, what you do is you
>> look at them and judge which one has the correct data, often the one
>> that looks more complete or the one with the newer timestamp. Or,
>> maybe you need to merge them somehow. In my experience helping users
>> through problems of this type, once you explain the problem to the
>> user and tell them they have to square it on their end, the support
>> call ends. The user may not always be entirely thrilled about having
>> to, say, validate a problematic record against external sources of
>> truth, but they usually know how to do it. Database bugs aren't the
>> only way that databases become inaccurate. If the database that they
>> use to keep track of land ownership in the jurisdiction where I live
>> says that two different people own the same piece of property,
>> somewhere there is a paper deed in a filing cabinet. Fishing that out
>> to understand what happened may not be fun, but a DBA can explain that
>> problem to other people in the organization and those people can get
>> it fixed. It's a problem, but it's fixable.
>>
>> On the other hand, if a heap tuple contains invalid infomask bits that
>> cause an error every time you read the page (this actually happened to
>> an EnterpriseDB customer!), the DBA can't tell other people how to fix
>> it and can't fix it personally either. Instead, the DBA calls me.
>
> After reading this statement, the ZFS filesystem comes to mind. It
> has built-in protection against various problems (data degradation,
> current spikes, phantom writes, etc).
>
> For me this raises two questions:
>
> 1) would the usage of ZFS prevent such errors?
>
> My feeling would say yes, but I have no idea how an invalid
> infomask bit could occur.
>
> 2) would it be possible to add such prevention to PostgreSQL?
>
> I know this could add a massive overhead, but if it is optional this
> could be a fine thing?
PostgreSQL is very "zfs-like" in its internals. The problem was a bug
in PostgreSQL that caused it to just write data to the wrong place.
Some vendors use ZFS under databases to provide very cool services such
as backup snapshots, test snapshots and other such uses. I think Joyent
is one such vendor but I'm not 100% sure.
-Alfred