From: | "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> |
---|---|
To: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Explaining duplicate rows in spite of unique index |
Date: | 2010-02-24 09:12:38 |
Message-ID: | D960CB61B694CF459DCFB4B0128514C20393819F@exadv11.host.magwien.gv.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I wrote:
> We recently found a couple of rows in a production database
> that had identical values in the columns constituting the primary key
> (The problem surfaced because a pg_dump could not be restored).
>
> Now I'm looking for explanations how this could happen.
>
> The rows originate from around the time when we had a hardware
> failure that corrupted the file system. The database came up
> after a file system check, and people continued working until
> we noticed that some tables were corrupted.
>
> At that point we restored an online backup and recovered past
> the time of the hardware failure. The WALs were intact and recovery
> completed successfully.
>
> Now does the following explanation sound plausible:
[...]
For the record, I set up a small experiment and could reproduce the
corruption like this:
- initdb new cluster, start the postmaster
- create a table with a primary key constraint
- perform an online backup
- stop the postmaster
- cp -p pg_clog/0000 pg_clog/0000.bak
- start the postmaster
- insert a record, commit
- stop the postmaster
- mv pg_clog/0000.bak pg_clog/0000
- start the postmaster
- Insert and commit a second record with the same primary key.
This will succeed since the previous commit is "gone".
- stop the postmaster
- perform PITR from the online backup to the end of WALs
The PITR will succeed, and you end up with two rows with the
same primary key, but only the second one is in the unique index.
Here's what I learned:
- Don't let anybody work on a database after a hardware problem
that affected the integrity of the file system, even if the
postmaster comes up without complaining.
- Never perform point-in-time-recovery beyond the time when
the hardware problem happened.
Yours,
Laurenz Albe
From | Date | Subject | |
---|---|---|---|
Next Message | Luigi Antognini | 2010-02-24 09:26:02 | Transaction isolation when applying DDLs |
Previous Message | Albe Laurenz | 2010-02-24 08:58:28 | Re: |