From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: page corruption on 8.3+ that makes it to standby |
Date: | 2010-07-27 19:50:08 |
Message-ID: | AANLkTikFxgRpmdmz=ez2-vta_3kExzObxgnxBjkna_35@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Jul 27, 2010 at 2:06 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> I reported a problem here:
>
> http://archives.postgresql.org/pgsql-bugs/2010-07/msg00173.php
>
> Perhaps I used a poor subject line, but I believe it's a serious issue.
> That reproducible sequence seems like an obvious bug to me on 8.3+, and
> what's worse, the corruption propagates to the standby as I found out
> today (through a test, fortunately).
I think that the problem is not so much your choice of subject line as
your misfortune to discover this bug when Tom and Heikki were both on
vacation.
> The only mitigating factor is that it doesn't actually lose data, and
> you can fix it (I believe) with zero_damaged_pages (or careful use of
> dd).
>
> There are two fixes that I can see:
>
> 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and
> PageSetTLI() if the page is not new. This seems slightly awkward because
> most WAL replay stuff doesn't have to worry about zero pages, but in
> this case I think it does.
>
> 2. Have copy_relation_data() initialize new pages. I don't like this
> because (a) it's not really the job of SET TABLESPACE to clean up zero
> pages; and (b) it could be an index with different special size, etc.,
> and it doesn't seem like a good place to figure that out.
It appears to me that all of the callers of log_newpage() other than
copy_relation_data() pass it pages that they've just constructed,
and which therefore can't be new. So maybe we could just modify
copy_relation_data() to check PageIsNew(buf), or something like that,
and only call log_newpage() if that returns false.
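
To make the idea concrete, here is a rough, self-contained sketch of the check. It assumes the intent is to skip WAL-logging for pages that are still new (all zeros), so they never get an LSN stamped on them. This is not the real copy_relation_data(): every sketch_* name, the simplified header struct, and the fake LSN are made up for illustration. The only part taken from the backend is the notion that a new page is one whose pd_upper is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192

/* Hypothetical, heavily simplified page header (not the real PageHeaderData). */
typedef struct SketchPageHeader
{
    uint64_t pd_lsn;    /* stand-in for the page LSN */
    uint16_t pd_lower;
    uint16_t pd_upper;  /* zero on a page that was never initialized */
} SketchPageHeader;

/* Same idea as PageIsNew(): a never-initialized page has pd_upper == 0. */
static int
sketch_page_is_new(const char *page)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    return hdr.pd_upper == 0;
}

/* Stand-in for log_newpage(): it only stamps an LSN on the page. */
static void
sketch_log_newpage(char *page, uint64_t lsn)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    hdr.pd_lsn = lsn;
    memcpy(page, &hdr, sizeof(hdr));
}

/* Returns 1 if every byte of the page is zero. */
static int
sketch_page_is_all_zeros(const char *page)
{
    for (size_t i = 0; i < SKETCH_BLCKSZ; i++)
    {
        if (page[i] != 0)
            return 0;
    }
    return 1;
}

/*
 * Copy nblocks pages from src to dst.  A page is "logged" only when it is
 * not new, so zero pages stay all zeros in the copy.  Returns the number
 * of pages that were logged.
 */
static int
sketch_copy_relation_data(char *dst, const char *src, int nblocks, int use_wal)
{
    int logged = 0;

    assert(nblocks >= 0);
    for (int blk = 0; blk < nblocks; blk++)
    {
        char *page = dst + (size_t) blk * SKETCH_BLCKSZ;

        memcpy(page, src + (size_t) blk * SKETCH_BLCKSZ, SKETCH_BLCKSZ);
        if (use_wal && !sketch_page_is_new(page))
        {
            sketch_log_newpage(page, (uint64_t) blk + 1);   /* fake LSN */
            logged++;
        }
    }
    return logged;
}
```

With a check like this, a zero page in the source relation stays a zero page in the copy, rather than becoming a page with a nonzero LSN and an otherwise empty header.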
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company