Re: recovery getting interrupted is not so unusual as it used to be

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: recovery getting interrupted is not so unusual as it used to be
Date: 2010-06-05 01:20:53
Message-ID: AANLkTin1xAEY582p3dE_Ykr6uwqw4frt25NRiwru9HEH@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 4, 2010 at 8:21 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Jun 3, 2010, at 5:25 , Robert Haas wrote:
>> On Wed, Jun 2, 2010 at 10:34 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>>>> Oh.  Well, if that's the case, then I guess I lean toward applying the
>>>> patch as-is.  Then there's no need for the caveat "and without manual
>>>> intervention".
>>>
>>> That still leaves the messages awfully ambiguous concerning the cause (data corruption) and the effect (crash during recovery).
>>>
>>> How about
>>> "If this has occurred more than once, it is probably caused by corrupt data and you have to use the latest backup for recovery"
>>> for the crash recovery case and
>>> "If this has occurred more than once, it is probably caused by corrupt data and you have to choose an earlier recovery target"
>>> for the PITR case.
>>>
>>> I don't see why currently only the PITR case includes the "more than once" clause. It's probably supposed to prevent unnecessarily alarming the user if the "crash" was in fact a stray SIGKILL or an out-of-memory condition, which seems equally likely in both cases.
>>
>> I've applied the patch for now - we can fix the wording of the other
>> messages with a follow-on patch if we agree on what they should say.
>> I don't like the use of the phrase "you have to", particularly...  I
>> would tend to leave the archive recovery message alone and change the
>> crash recovery message to be more like it.
>
> Since a loose log of this shed gave me quite a bump on my forehead once, one last attempt at fixing it.
>
> I've tried to keep this as similar as possible to the existing message while making it less ambiguous about cause and effect.
>
> "If this has occurred more than once corrupt data might be the cause and you might need to choose an earlier recovery target".
> and
> "If this has occurred more than once corrupt data might be the cause and you might need to restore from backup".

How about:

If the database system is exiting unexpectedly during archive
recovery, some data might be corrupted and you might need to choose an
earlier recovery target.
If the database system is exiting unexpectedly during crash recovery,
some data might be corrupted and you might need to restore from
backup.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
