From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | page corruption on 8.3+ that makes it to standby |
Date: | 2010-07-27 18:06:15 |
Message-ID: | 1280253975.23350.99.camel@jdavis |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I reported a problem here:
http://archives.postgresql.org/pgsql-bugs/2010-07/msg00173.php
Perhaps I used a poor subject line, but I believe it's a serious issue.
That reproducible sequence seems like an obvious bug to me on 8.3+, and
what's worse, the corruption propagates to the standby as I found out
today (through a test, fortunately).
The only mitigating factor is that it doesn't actually lose data, and
you can fix it (I believe) with zero_damaged_pages (or careful use of
dd).
There are two fixes that I can see:
1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and
PageSetTLI() if the page is not new. This seems slightly awkward because
most WAL replay stuff doesn't have to worry about zero pages, but in
this case I think it does.
2. Have copy_relation_data() initialize new pages. I don't like this
because (a) it's not really the job of SET TABLESPACE to clean up zero
pages; and (b) it could be an index with different special size, etc.,
and it doesn't seem like a good place to figure that out.
Comments?
Regards,
Jeff Davis
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2010-07-27 18:11:02 | Re: do we need to postpone beta4? |
Previous Message | Robert Haas | 2010-07-27 18:04:04 | do we need to postpone beta4? |