Re: Corrupted Data ?

From: Melvin Davidson <melvin6925(at)gmail(dot)com>
To: Ioana Danes <ioanadanes(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Corrupted Data ?
Date: 2016-08-08 16:53:34
Message-ID: CANu8FiwzJhZVWL-6R6WyVAeO6JVAv=dVEaavEO8oGsJP6NYCrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Aug 8, 2016 at 12:47 PM, Ioana Danes <ioanadanes(at)gmail(dot)com> wrote:

>
>
> On Mon, Aug 8, 2016 at 12:37 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
> wrote:
>
>> On 08/08/2016 09:28 AM, Ioana Danes wrote:
>>
>>>
>>>
>>> On Mon, Aug 8, 2016 at 12:19 PM, Adrian Klaver
>>> <adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>>>
>>> On 08/08/2016 09:11 AM, Ioana Danes wrote:
>>>
>>> Hi,
>>>
>>> I suspect I am having a case of data corruption. Here are the
>>> details:
>>>
>>> I am running postgres 9.4.8:
>>>
>>> postgresql94-9.4.8-1PGDG.rhel7.x86_64
>>> postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
>>> postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
>>> postgresql94-server-9.4.8-1PGDG.rhel7.x86_64
>>>
>>> on CentOS Linux release 7.2.1511 (Core)
>>>
>>> This is happening in a production environment but luckily on the
>>> reporting database.
>>> I have a cluster of 3 databases, db1 and db2 are masters and
>>> replicate
>>> between each other and also replicate to db3 (db1 <-> db2, db1
>>> -> db3,
>>> db2 -> db3).
>>> For replication I am using Bucardo.
>>>
>>>
>>> I would say this is more a question for the Burcardo list:
>>>
>>> https://mail.endcrypt.com/mailman/listinfo/bucardo-general
>>> <https://mail.endcrypt.com/mailman/listinfo/bucardo-general>
>>>
>>> I am just not seeing that replicating two masters on to a single
>>> database is going to end well.
>>>
>>>
>>> Only one master is active at one time the other one is in stand by that
>>> is a topic for another discussion but in our case that works well.
>>>
>>> That was my first assumption, that it is a kind of a race condition or a
>>> bug on replication but I quickly ruled that out because that does not
>>> explain why when I filtered the table by transactionid = 75315815 it
>>> shows one record with transactionid 75315811...
>>>
>>> select gameplayid, transactionid, encodedplay from abrazo.matchgameplay
>>> where transactionid in (75315815) order by transactionid;;
>>> gameplayid | transactionid | encodedplay
>>> ------------+---------------+--------------
>>> 160019271 | 75315815 | mix:9,0,9
>>> 160019269 | 75315815 | mix:9,8,9
>>> 160019267 | 75315815 | mix:9,2,2
>>> 160019265 | 75315815 | mix:2,2,8
>>> 160019263 | *75315811 *| backup:1,9,1
>>> 160019261 | 75315815 | backup:2,0,9
>>>
>>> So I don't think it is a replication issue...
>>>
>>>
>> Other that, if I am following correctly, it is on the database(db3) being
>> replicated to. The only way db3 is getting its data is through replication,
>> is that correct?. On the master databases the data is correct.
>>
>> OK, let's assume that what you're saying is correct and the replication
> has a bug, or corruption or whatever that is and the record gets created
> with transactionid = 75315811. Bucardo replication is trigger based and it
> is using a copy command to insert the new records into the replicated
> database.
>
> Then how can I explain that my query select gameplayid, transactionid,
> encodedplay from abrazo.matchgameplay where transactionid in (75315815)
> order by transactionid; returns me a record with transactionid 75315811???
>
> Thanks,
> ioana
>
>
>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)aklaver(dot)com
>>
>
>
You did provide the full table structure, but this might possibly be a case
of index corruption.
Try executing the following query and if any show up as invalid, just drop
and rebuild.

SELECT n.nspname,
i.relname,
i.indexrelname,
CASE WHEN idx.indisprimary
THEN 'pkey'
WHEN idx.indisunique
THEN 'uidx'
ELSE 'idx'
END AS type,
'INVALID'
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid = i.indexrelid )
WHERE idx.indisvalid = FALSE
ORDER BY 1, 2;

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2016-08-08 16:55:18 Re: Corrupted Data ?
Previous Message Ioana Danes 2016-08-08 16:47:59 Re: Corrupted Data ?