Re: Memory Errors

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Sam Nelson <samn(at)consistentstate(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Memory Errors
Date: 2010-09-09 14:14:08
Message-ID: AANLkTikPw+BC+XSg0H9V8Gz5rdT0vcX96UmAZGQdoE3A@mail.gmail.com
Lists: pgsql-general

On Wed, Sep 8, 2010 at 6:55 PM, Sam Nelson <samn(at)consistentstate(dot)com> wrote:
> Even if the corruption wasn't a result of that, we weren't too excited about
> the process being there to begin with.  We thought there had to be a better
> solution than just killing the processes.  So we had a discussion about the
> intent of that script and my boss dealt with something that solved the same
> problem without killing queries, then had them stop that daemon and we have
> been working with that database to make sure it doesn't go screwy again.  No
> new corruption has shown up since stopping that daemon.
> That memory allocation issue looked drastically different from the toast
> value errors, though, so it seemed like a separate problem.  But now it's
> looking like more corruption.
> ---
> We're requesting that they do a few things (this is their production
> database, so we usually don't alter any data unless they ask us to),
> including deleting those rows.  My memory is insufficient, so there's a good
> chance that I'll forget to post back to the mailing list with the results,
> but I'll try to remember to do so.
> Thank you for the help - I'm sure I'll be back soon with many more
> questions.

Any information on repeatable data corruption, whether it is ec2
improperly flushing data on instance resets, postgres misbehaving
under atypical conditions, or bad interactions between ec2 and
postgres is highly valuable. The only cases of 'understandable' data
corruption are hardware failures, sync issues (either fsync off, or
fsync not honored by hardware), torn pages on non-journaling file
systems, etc.

Naturally people are going to be skeptical of ec2 since you are so
abstracted from the hardware. Maybe all your problems stem from a
single explainable incident -- but we definitely want to get to the
bottom of this...please keep us updated!

merlin
