Re: Upgrading from Postgresql 9.3.8 to 9.6.10

From: Keith <keith(at)keithf4(dot)com>
To: Yuri Niyazov <yuri(at)academia(dot)edu>
Cc: Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: Upgrading from Postgresql 9.3.8 to 9.6.10
Date: 2019-04-25 22:17:05
Message-ID: CAHw75vtMsXoo-n903JZ9E8COe9FDhFQRj8mfPji1hPPFJ7cXsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Thu, Apr 25, 2019 at 4:43 PM Yuri Niyazov <yuri(at)academia(dot)edu> wrote:

> On Mon, Mar 25, 2019 at 11:44 PM Keith <keith(at)keithf4(dot)com> wrote:
>
>> On Tue, Mar 26, 2019 at 1:51 AM Yuri Niyazov <yuri(at)academia(dot)edu> wrote:
>>
>>> We have an old cluster, running 9.3.8, that we are trying to upgrade to
>>> 9.6.10.
>>>
>>> After running pg_upgrade and starting the server, and testing some
>>> common queries against it, we experienced the following error:
>>>
>>> PG::IndexCorrupted: ERROR: index "table_pkey" contains unexpected zero
>>> page at block 17021871 HINT: Please REINDEX it.
>>>
>>> So, if I am reading this correctly, there is an index that 9.3.8
>>> created, and it recognizes that index as OK, but 9.6.10 thinks that index
>>> is corrupt. Since this happened on one index, it seems reasonable to assume
>>> that this could happen on any index in that database, and we need to
>>> reindex the entire database.
>>>
>>> Now, this is a live application, and we would like to minimize
>>> continuous downtime (multiple short downtimes are fine), so what we are
>>> considering doing is a manual reindex: for each index, create a new one,
>>> and then drop the old one, and after all that, upgrade.
>>>
>>> However, we are leery of doing this reindexing using 9.3.8, since it's
>>> already demonstrated itself to be unreliable.
>>>
>>> Which version should we use to reindex? Just the latest 9.3.X? Was there
>>> a known bug with older versions missing corrupted pages in indices?
>>>
>>> PS: Just in case I am glaringly doing something wrong:
>>>
>>> Here's our mechanism of doing the upgrade: we create a streaming replica
>>> from a basebackup, and then at some point turn off writes to the primary,
>>> convert the replica to another primary, and then run pg_upgrade on this new
>>> primary.
>>>
>>
>>
>> Have you done this more than once? If so, is it the same indexes every
>> time you're having issues with?
>>
>> Answering these questions will help you narrow down if the corruption is
>> in your original database or something to do with the upgrade.
>>
>> However, I would also highly recommend getting to the latest version of
>> 9.3 and rebuilding your replicas afterwards. The latest patch release was
>> 9.3.25, so your current version is quite far behind. There were quite a
>> number of data corruption issues related to replication in the 9.3 series.
>> More-so in the earlier patch versions, but I highly recommend also trying
>> this before your major upgrade.
>>
>> Keith
>>
>
> We tried it a second time, and were unable to reproduce the issue, so it
> seems like the issue is non-deterministic, which is a little scary. In any
> case, we then proceeded to upgrade to 9.3.25 as you recommended, and then
> upgraded to the 9.6 series from it, and the issue didn't reproduce; we've
> been running on the new upgraded version for thel last two weeks and it
> looks ok so far.
>
> Thanks for your help!
>
>
Yuri,

Glad things are working, but I would highly recommend running a reindex on
all tables in your cluster at some point, just as a precaution. It may not
hurt to also run a pg_dump/restore of the database to ensure it doesn't
find any corrupt data during that process as well. Don't mean that you have
to rebuild your current system, just make a dump and restore it to a
secondary system somewhere and make sure it all works ok.

Keith

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message arvind chikne 2019-04-26 06:43:55 Re: BDR
Previous Message Yuri Niyazov 2019-04-25 20:43:25 Re: Upgrading from Postgresql 9.3.8 to 9.6.10