From: | "MauMau" <maumau307(at)gmail(dot)com> |
---|---|
To: | "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com> |
Cc: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Back-branch update releases coming in a couple weeks |
Date: | 2013-01-24 14:53:53 |
Message-ID: | 7AE503F0CB83442082C20ECE2B0A6E4B@maumau |
Lists: | pgsql-hackers |
From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
> On Thu, Jan 24, 2013 at 7:42 AM, MauMau <maumau307(at)gmail(dot)com> wrote:
>> I searched the PostgreSQL mailing lists for "WAL contains references to
>> invalid pages" and found 19 messages. Some people encountered similar
>> problems. There were some discussions regarding those problems (Tom and
>> Simon Riggs commented), but those discussions did not reach a solution.
>>
>> I also found a discussion which might relate to this problem. Does this fix
>> the problem?
>>
>> [BUG] lag of minRecoveryPont in archive recovery
>> http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp
>
> Yes. Could you check whether you can reproduce the problem on the
> latest REL9_2_STABLE?
I tried to reproduce the problem by running "pg_ctl stop -mi" against the
primary more than ten times on REL9_2_STABLE, but the problem did not
appear. However, on PostgreSQL 9.1.6 I hit the crash only once in dozens of
failovers, possibly more than a hundred. So I'm not sure the problem is
fixed in REL9_2_STABLE.
I'm wondering if the fix discussed in the above thread solves my problem. I
found the following differences between Horiguchi-san's case and my case:
(1)
Horiguchi-san says the bug outputs the message:
WARNING: page 0 of relation base/16384/16385 does not exist
On the other hand, I got the message:
WARNING: page 506747 of relation base/482272/482304 was uninitialized
(2)
Horiguchi-san reproduced the problem by shutting the standby down
immediately and restarting it. However, I saw the problem during failover.
(3)
Horiguchi-san did not use any index, but in my case the WARNING message
refers to an index.
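
To compare warnings like the two in (1) side by side, the page number and relation path can be pulled out of each log line. The following is only a minimal sketch in Python: the function name, the tuple fields, and the regular expression are illustrative assumptions, not anything provided by PostgreSQL. It assumes the path has the form base/<database OID>/<relfilenode>, as in both messages above.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern covering only the two message variants quoted above.
WARNING_RE = re.compile(
    r"WARNING:\s+page (?P<page>\d+) of relation (?P<path>\S+) "
    r"(?P<state>does not exist|was uninitialized)"
)


class InvalidPage(NamedTuple):
    page: int                   # block number reported in the warning
    path: str                   # relation path exactly as logged
    state: str                  # "does not exist" or "was uninitialized"
    db_oid: Optional[int]       # database OID, if the path is base/<db>/<rel>
    relfilenode: Optional[int]  # relation file node, same condition


def parse_invalid_page(line: str) -> Optional[InvalidPage]:
    """Return the parsed warning, or None if the line is not such a warning."""
    m = WARNING_RE.search(line)
    if m is None:
        return None
    path = m.group("path")
    parts = path.split("/")
    db_oid = relfilenode = None
    # Only plain base/<db OID>/<relfilenode> paths are split further.
    if (len(parts) == 3 and parts[0] == "base"
            and parts[1].isdigit() and parts[2].isdigit()):
        db_oid, relfilenode = int(parts[1]), int(parts[2])
    return InvalidPage(int(m.group("page")), path, m.group("state"),
                       db_oid, relfilenode)


if __name__ == "__main__":
    for msg in (
        "WARNING: page 0 of relation base/16384/16385 does not exist",
        "WARNING: page 506747 of relation base/482272/482304 was uninitialized",
    ):
        print(parse_invalid_page(msg))
```

The extracted relfilenode can then be looked up in pg_class within the affected database (for example, SELECT relname, relkind FROM pg_class WHERE relfilenode = 482304;) to check whether the warning refers to a table or an index.
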
But there is also a similarity. Horiguchi-san says the problem occurs after
DELETE+VACUUM. In my case, I shut the primary down while the application
was doing INSERT/UPDATE. As the messages below show, some vacuuming was
running just before the immediate shutdown:
...
LOG: automatic vacuum of table "GOLD.scm1.tbl1": index scans: 0
pages: 0 removed, 36743 remain
tuples: 0 removed, 73764 remain
system usage: CPU 0.09s/0.11u sec elapsed 0.66 sec
LOG: automatic analyze of table "GOLD.scm1.tbl1" system usage: CPU 0.00s/0.14u sec elapsed 0.32 sec
LOG: automatic vacuum of table "GOLD.scm1.tbl2": index scans: 0
pages: 0 removed, 12101 remain
tuples: 40657 removed, 44142 remain
system usage: CPU 0.06s/0.06u sec elapsed 0.30 sec
LOG: automatic analyze of table "GOLD.scm1.tbl2" system usage: CPU 0.00s/0.06u sec elapsed 0.14 sec
LOG: received immediate shutdown request
...
Could you tell me the details of the problem discussed and fixed in the
upcoming minor release? I would like to know the symptom and the conditions
under which it occurs, and whether it applies to my case.
Regards
MauMau