Re: CORRUPTION on TOAST table

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Soni M <diptatapa(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: CORRUPTION on TOAST table
Date: 2016-04-03 16:06:03
Message-ID: 57013F6B.8090904@aklaver.com
Lists: pgsql-general

On 04/02/2016 08:38 PM, Soni M wrote:
> Hello Everyone,
>
> We are facing TOAST table corruption.
>
> We have one master and two streaming replicas. The corruption happens only
> on the two streaming replicas.
>
> We found the corrupted rows. Selecting this row returns (on both
> replicas): unexpected chunk number 0 (expected 1) for toast value
> 1100613112 in pg_toast_112517
> Selecting this row on the master does not return a corruption error; it
> returns the correct result instead.
>
> Previously, a dump on a replica returned: unexpected chunk number 0
> (expected 1) for toast value 3234098599 in pg_toast_112517 (please note
> that the toast value is different).
>
> The table is 343 GB in size and contains around 206,179,697 live tuples. We
> found that the corruption happens in the biggest column (this column and
> its pkey total around 299 GB).
>
> replica1 :
> ESX 5.5, VM Version 8
> Intel(R) Xeon(R) CPU E5649 @ 2.53GHz
> 8GB RAM
> Storage – Raw Disk Mapping in ESX from 3PAR 7400 SAN using Fast Class
> (10k) disk
> Each volume (single disk as presented by SAN) on the VMs is its own LVM
> volume.
>
> replica2 :
> ESX 5.5, VM Version 8
> Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
> 8GB RAM
> Raw Disk Mapping in ESX from 3PAR 7400 SAN using Fast Class (10k) disk
> Each volume (single disk as presented by SAN) on the VMs is its own LVM
> volume.

So where is the master data located, on the SAN or somewhere different?

To be clear about the above, each replica is its own VM with its own virtual
disk/volume as served up from the same SAN, correct?

Can you elaborate more on what is actually taking place with the raw
disk mapping?

>
> On both replicas:
> fsync was NEVER turned off.
> No unexpected power loss or OS crash.
>
> How can the corruption occur, and how can I resolve it?
>
> Thanks so much for the help.
>
> Cheers \o/
>
> --
> Regards,
>
> Soni Maula Harriz

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
