From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: regression test failed when enabling checksum
Date: 2013-04-01 17:37:47
Message-ID: CAMkU=1xUza1A3WLRUTQHrP4WaOW1G2y4QK=uoac0NXv9pSk0pQ@mail.gmail.com
Lists: pgsql-hackers
On Tue, Mar 26, 2013 at 4:23 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Tue, 2013-03-26 at 02:50 +0900, Fujii Masao wrote:
> > Hi,
> >
> > I found that the regression test failed when I created the database
> > cluster with the checksum and set wal_level to archive. I think that
> > there are some bugs around checksum feature. Attached is the
> regression.diff.
>
> Thank you for the report. This was a significant oversight, but simple
> to diagnose and fix.
>
> There were several places that were doing something like:
>
> PageSetChecksumInplace
> if (use_wal)
>     log_newpage
> smgrextend
>
> That is obviously wrong, because log_newpage sets the LSN of the page,
> invalidating the checksum. We need to set the checksum after
> log_newpage.
>
> Also, I noticed that copy_relation_data was doing smgrread without
> validating the checksum (or page header, for that matter), so I also
> fixed that.
>
> Patch attached. Only brief testing done, so I might have missed
> something. I will look more closely later.
>
After applying your patch, I could run the stress test described here:
http://archives.postgresql.org/pgsql-hackers/2012-02/msg01227.php
(altered to make use of initdb -k, of course).
Over 10,000 cycles of crash and recovery, I encountered two cases of
checksum failures after recovery, example:
14264 SELECT 2013-03-28 13:08:38.980 PDT:WARNING: page verification
failed, calculated checksum 7017 but expected 1098
14264 SELECT 2013-03-28 13:08:38.980 PDT:ERROR: invalid page in block 77
of relation base/16384/2088965
14264 SELECT 2013-03-28 13:08:38.980 PDT:STATEMENT: select sum(count) from
foo
In both cases, the bad block (77 in this case) is the same block that was
intentionally partially written during the "crash". However, that block
should have been restored from the WAL full-page write (FPW), so its torn
state should no longer have been present to be detected. Any idea what is
going on?
Unfortunately I already cleaned up the data directory before noticing the
problem, so I have nothing to post for forensic analysis. I'll try to
reproduce the problem.
Without the initdb -k option, I ran it for 30,000 cycles and found no
problems. I don't think that is because the problem exists but goes
undetected: my test is designed to detect such problems, since a block
that is torn but not overwritten by a WAL FPW should occasionally lead to
detectably inconsistent tuples.
I don't think your patch caused this particular problem; rather, it merely
fixed a problem that was previously preventing me from running my test.
Cheers,
Jeff