Re: VMWare file system / database corruption

From: Tom Duffey <tduffey(at)techbydesign(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: VMWare file system / database corruption
Date: 2009-09-21 18:46:56
Message-ID: A755A57B-107F-420D-9956-0B5BA3ABD8A7@techbydesign.com
Lists: pgsql-general


On Sep 21, 2009, at 12:40 PM, Scott Marlowe wrote:

> On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey
> <tduffey(at)techbydesign(dot)com> wrote:
>> Hi All,
>>
>> We're having numerous problems with a PostgreSQL 8.3.7 database running on a
>> virtual Linux server w/VMWare ESX. This is not by choice and I have been
>> asking the operator of this equipment for details about the disk setup and
>> here's what I got:
>>
>> "We have a SAN that is presenting an NFS share. VMWare sees that share and
>> reads the VMDK file that make up the virtual file system."
>>
>> Does anyone with a better understanding of PostgreSQL and VMWare know if
>> this is an unreliable setup for PostgreSQL? I see things like "NFS" and
>> "VMWare" and start to get worried.
>
> I see VMWare and think performance issues; I see NFS and think dear
> god help us all. Even if properly set up, NFS is a problem waiting to
> happen, and it's not reliable storage for a database in my opinion.
> That said, lots of folks do it. Ask for the NFS mount options from
> the sysadmin.
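
For reference, on a Linux machine that mounts an NFS share directly, the
mount options can be read from /proc/mounts. In the setup described above the
share is presumably mounted by the ESX host rather than by the guest, so this
is only a sketch of what to look for, not something that will necessarily work
from inside the VM. The function name and the sample line are made up for
illustration; options such as hard/soft and sync/async are the ones usually
discussed for database storage.

```python
from typing import Dict, List

# Filesystem type names used for NFS entries in /proc/mounts on Linux.
NFS_TYPES = {"nfs", "nfs4"}


def nfs_mounts(mounts_text: str) -> List[Dict[str, object]]:
    """Return device, mount point, and option list for each NFS entry.

    `mounts_text` is expected to be in /proc/mounts format:
    device  mount_point  fs_type  options  dump  pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        if fs_type in NFS_TYPES:
            found.append({
                "device": device,
                "mount_point": mount_point,
                "options": options.split(","),
            })
    return found


# Hypothetical sample; on a real host, pass open("/proc/mounts").read() instead.
sample = "san:/vol/vm /mnt/vm nfs rw,hard,intr,sync 0 0\n/dev/sda1 / ext3 rw 0 0\n"
for entry in nfs_mounts(sample):
    print(entry["mount_point"], ",".join(entry["options"]))
```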

Thanks to everyone so far for the insight. I'm trying to get more
details about the hardware setup but am not making much progress.

Here are some of the errors we're getting. I searched through
archives and they all seem to point at hardware trouble but is there
anything else I should be looking at?

ERROR: invalid page header in block 2 of relation "pg_toast_19466_index"

ERROR: invalid memory alloc request size 1667592311
STATEMENT: COPY public.version_bundle (node_id_hi, node_id_lo, bundle_data) TO stdout;

ERROR: unexpected chunk number 1632 (expected 1629) for toast value 19711 in pg_toast_19184
STATEMENT: COPY public.data_binval (binval_id, binval_data) TO stdout;

ERROR: invalid page header in block 414 of relation "pg_toast_19460_index"

ERROR: could not open segment 1 of relation 1663/16386/16535 (target block 3966127611): No such file or directory
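
One quick check on an "invalid memory alloc request size" error is to look at
the bogus size as raw bytes. The sketch below assumes a little-endian (x86)
host, which the thread does not state, and the helper name is made up. If all
four bytes come out as printable ASCII, that is consistent with a length word
holding unrelated data, but it proves nothing by itself about the cause.

```python
import struct


def as_ascii_bytes(value: int) -> str:
    """Render a 32-bit value as its four little-endian bytes.

    Printable ASCII bytes are shown as characters, anything else as '.'.
    """
    raw = struct.pack("<I", value)  # unsigned 32-bit, little-endian
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


size = 1667592311  # from the "invalid memory alloc request size" error
print(hex(size), as_ascii_bytes(size))  # prints: 0x63657077 wpec
```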

I dealt with some of the above by reindexing or by finding and deleting
bad rows. I can now successfully dump the database but of course have
missing data, so the application is toast. What I'm really wondering
now is how to prevent this from happening again, and whether that means
moving the database to new hardware.
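
The message does not say how the bad rows were found. One common approach,
sketched here under that caveat, is to read rows one key at a time and record
the keys whose fetch raises an error. Both `find_unreadable` and the `fetch`
callable are hypothetical: in practice `fetch` would run a single-row SELECT
through whatever database driver is in use, and would need to roll back the
failed transaction itself before the next call.

```python
from typing import Callable, Iterable, List, Tuple


def find_unreadable(
    keys: Iterable[int],
    fetch: Callable[[int], object],
) -> List[Tuple[int, str]]:
    """Call fetch(key) for every key and collect the ones that fail.

    Returns (key, error message) pairs in the order the keys were tried.
    """
    bad = []
    for key in keys:
        try:
            fetch(key)
        except Exception as exc:  # any failure marks the row as suspect
            bad.append((key, str(exc)))
    return bad


# Stand-in for a real single-row query, used only to demonstrate the loop.
def fake_fetch(key: int) -> str:
    if key in (3, 7):
        raise RuntimeError("unexpected chunk number")
    return "row %d" % key


print(find_unreadable(range(1, 10), fake_fetch))
```

Rows flagged this way still need to be inspected before anything is deleted,
since an error on one key can come from a damaged index rather than from the
row itself.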

Best Regards,

Tom
