From: | Serge Negodyuck <petr(at)petrovich(dot)kiev(dot)ua> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby |
Date: | 2013-12-18 14:20:48 |
Message-ID: | CABKyZDEENX2X5HMqNtMB34zBAr2UZxrcs4SbXx169xDRKeZ4DA@mail.gmail.com |
Lists: | pgsql-bugs pgsql-hackers |
2013/12/10 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:
> Andres Freund wrote:
>
>> > > I think problems should be preventable if you issue a systemwide VACUUM
>> > > FREEZE, but please let others chime in before you execute it.
>> >
>> > I wouldn't freeze anything just yet, at least until the patch to fix
>> > multixact freezing is in.
>>
>> Well, it seems better than getting errors because of multixact members
>> that are gone.
>> Maybe PGOPTIONS='-c vacuum_freeze_table_age=0 -c vacuum_freeze_min_age=1000000' vacuumdb -a
>> - that ought not to cause problems with current data and should freeze
>> enough to get rid of problematic multis?
>
> TBH I don't feel comfortable predicting what it will freeze with
> the broken code.
>
You guys were right. After a week this issue occurred again on almost
all slave servers.
slave:
2013-12-17 14:21:20 MSK CONTEXT: xlog redo delete: index 1663/16516/5320124; iblk 8764, heap 1663/16516/18816;
2013-12-17 14:21:20 MSK LOG: file "pg_clog/0370" doesn't exist, reading as zeroes
2013-12-17 14:21:20 MSK FATAL: MultiXactId 1819308905 has not been created yet -- apparent wraparound
2013-12-17 14:21:20 MSK CONTEXT: xlog redo delete: index 1663/16516/5320124; iblk 8764, heap 1663/16516/18816;
2013-12-17 14:21:20 MSK LOG: startup process (PID 13622) exited with exit code 1
I had to fix something on the master, since all slaves were affected. The
only idea I had was to perform VACUUM FREEZE on the master.
I believe that was not a good idea. I suppose the VACUUM FREEZE led to
the following errors on the master:
2013-12-17 13:15:34 EET 172.18.10.44 ruprom ERROR: could not access status of transaction 8407326
2013-12-17 13:15:34 EET 172.18.10.44 ruprom DETAIL: Could not open file "pg_multixact/members/A458": No such file or directory.
The only way out was to perform a full backup/restore, which failed
with the same error (could not access status of transaction
xxxxxxx).
A very ugly hack was to copy pg_multixact/members/0000 ->
pg_multixact/members/[ABCDF]xxx. That allowed the full backup to
complete, but I am not sure about the consistency of the data.
My question is: is there any quick-and-dirty way to disable
pg_multixact deletion? I understand it may lead to wasted space.