From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Daniel Farina <daniel(at)heroku(dot)com> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Enabling Checksums |
Date: | 2013-03-05 09:01:50 |
Message-ID: | CA+U5nMJY4+JLnFBQOMtthpd5fSTS-=YC9oBqbierUJ8eBFhc6Q@mail.gmail.com |
Lists: | pgsql-hackers |
On 5 March 2013 01:04, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> Corruption has easily occupied more than one person-month of time last
> year for us. This year to date I've burned two weeks, although
> admittedly this was probably the result of statistical clustering.
> Other colleagues of mine have probably put in a week or two in
> aggregate in this year to date. The ability to quickly, accurately,
> and maybe at some later date proactively find good backups to run
> WAL recovery from is one of the biggest strides we can make in the
> operation of Postgres. The especially ugly cases are where the page
> header is not corrupt, so full page images can carry along malformed
> tuples...basically, when the corruption works its way into the WAL,
> we're in much worse shape. Checksums would hopefully prevent this
> case, converting them into corrupt pages that will not be modified.
>
> It would be better yet if I could write tools to find the last-good
> version of pages, and so I think tight integration with Postgres will
> see a lot of benefits that would be quite difficult and non-portable
> when relying on file system checksumming.
>
> You are among the most well-positioned to make assessments of the cost
> of the feature, but I thought you might appreciate a perspective of
> the benefits, too. I think they're large, and for me they are the
> highest pole in the tent for "what makes Postgres stressful to operate
> as-is today." It's a testament to the quality of the programming in
> Postgres that Postgres programming error is not the largest problem.

That's good perspective.

I think we all need to be clear that committing this patch also
commits the community (via the committer) to significant work and
responsibility around this, and my minimum assessment of it is 1 month
per year for 3-5 years, much of that on the committer. In effect
this will move time and annoyance experienced by users of Postgres
back onto developers of Postgres. That is where it should be, but the
effect will be large and easily noticeable, IMHO.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services