From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [pgsql-hackers] Daily digest v1.9418 (15 messages) |
Date: | 2009-08-27 16:47:30 |
Message-ID: | f67928030908270947h10862c70h9e3a4c59ab21f337@mail.gmail.com |
Lists: | pgsql-hackers |
>
> ---------- Forwarded message ----------
> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> To: Robert Haas <robertmhaas(at)gmail(dot)com>
> Date: Thu, 27 Aug 2009 10:11:24 -0400
> Subject: Re: 8.5 release timetable, again
>
> What I'd like to see is some sort of test mechanism for WAL recovery.
> What I've done sometimes in the past (and recently had to fix the tests
> to re-enable) is to kill -9 a backend immediately after running the
> regression tests, let the system replay the WAL for the tests, and then
> take a pg_dump and compare that to the dump gotten after a conventional
> run. However this is quite haphazard since (a) the regression tests
> aren't especially designed to exercise all of the WAL logic, and (b)
> pg_dump might not show the effects of some problems, particularly not
> corruption in non-system indexes. It would be worth the trouble to
> create a more specific test methodology.
I hacked mdwrite so that it had a static int counter. When the counter hit
400 and the guc_of_death was set, it would write out a partial block (to
simulate a partial page write) and then PANIC. I have some Perl code that
runs against the database doing a bunch of updates until the database dies.
Then when it can reconnect again it makes sure the data reflects what Perl
thinks it should. This is how I (belatedly) found and traced down the bug
in the visibility bit. (What I was trying to do is determine if my toying
around with XLogInsert was breaking anything. Since the regression suite
wouldn't show me a problem if one existed, I came up with this. Then I
found things were broken even before I started toying with it...)
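As a rough illustration of the fault injection described above, here is a Python sketch. It is not the actual C change to mdwrite: `FaultyWriter`, `SimulatedPanic`, and `write_block` are hypothetical names, the `crash_enabled` flag merely plays the role of guc_of_death, and writing exactly the first half of the block is an assumption about what "partial" means.

```python
import io

BLOCK_SIZE = 8192      # PostgreSQL's default page size
CRASH_AT_WRITE = 400   # the counter threshold mentioned in the message


class SimulatedPanic(Exception):
    """Stands in for PANIC; a real backend would abort the process."""


class FaultyWriter:
    """Hypothetical stand-in for the hacked mdwrite, not PostgreSQL code."""

    def __init__(self, storage, crash_enabled=False, crash_at=CRASH_AT_WRITE):
        self.storage = storage              # any seekable binary file object
        self.crash_enabled = crash_enabled  # plays the role of guc_of_death
        self.crash_at = crash_at
        self.write_count = 0                # plays the role of the static int counter

    def write_block(self, block_no, data):
        """Write one block, or tear it and raise on the chosen write."""
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.write_count += 1
        self.storage.seek(block_no * BLOCK_SIZE)
        if self.crash_enabled and self.write_count == self.crash_at:
            # Torn page: only the first half of the block reaches storage.
            self.storage.write(data[: BLOCK_SIZE // 2])
            raise SimulatedPanic(f"simulated crash on write {self.write_count}")
        self.storage.write(data)
```

With `crash_at=2` and an `io.BytesIO` as storage, writing a block of `A` bytes and then a block of `B` bytes to the same block number leaves a page whose first half is `B` and second half is `A`, which is the kind of torn page that recovery has to repair.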
I don't know how lucky I was to hit upon a test that found an already
existing bug. I have to assume I was somewhat lucky, simply because it took
a run of many hours or overnight (with a simulated crash every 2 minutes or
so) to reliably detect the problem. But how do you turn something like this
into a regression test? Scattering the code with intentional crash-inducing
code that is there to exercise the error recovery parts seems like it would
be quite a mess.
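The update-until-crash, reconnect, and verify cycle described above might look roughly like the following Python sketch. It is not the Perl code from the message. `ToyDatabase` is a hypothetical in-memory stand-in for the server, it crashes after a fixed number of operations rather than on a timer, and treating the update that was in flight at crash time as "in doubt" (either outcome is accepted) is an assumption about how such a harness would avoid false alarms.

```python
import random


class ConnectionLost(Exception):
    """Raised by the hypothetical client when the server dies mid-request."""


class ToyDatabase:
    """In-memory stand-in for a crash-prone server; purely illustrative."""

    def __init__(self, crash_every, rng):
        self.rows = {}
        self.crash_every = crash_every
        self.rng = rng
        self.ops = 0

    def update(self, key, value):
        self.ops += 1
        if self.ops % self.crash_every == 0:
            # Crash at an arbitrary point: the commit may or may not be durable.
            if self.rng.random() < 0.5:
                self.rows[key] = value
            raise ConnectionLost()
        self.rows[key] = value

    def read_all(self):
        return dict(self.rows)


def run_until_crash(db, expected, rng, keys=10):
    """Apply random updates until the connection drops; return the in-doubt update."""
    while True:
        key, value = rng.randrange(keys), rng.randrange(1_000_000)
        try:
            db.update(key, value)
        except ConnectionLost:
            return key, value   # outcome unknown: may or may not have committed
        expected[key] = value   # record only acknowledged commits


def verify(db, expected, in_doubt):
    """Compare recovered data with the client's view; return mismatched keys."""
    actual = db.read_all()
    key, value = in_doubt
    if actual.get(key) == value:
        expected[key] = value   # the in-doubt update turned out to be durable
    return [k for k in set(actual) | set(expected) if actual.get(k) != expected.get(k)]
```

A driver would call `run_until_crash` and then `verify` in a loop, stopping at the first non-empty result. Against a real server, the "reconnect" step would sit between the two calls, and a bug that shows up only rarely would still need many iterations to surface, as noted above.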
> In short: merely making the tests bigger doesn't impress me in the
> least. Focused testing on areas we aren't covering at all could be
> worth the trouble.
Do you have suggestions on what other areas need it?
Jeff