From: | Michael Brusser <michael(at)synchronicity(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Pgsql-Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: database errors |
Date: | 2004-05-14 00:08:55 |
Message-ID: | DEEIJKLFNJGBEMBLBAHCKEHBEKAA.michael@synchronicity.com |
Lists: | pgsql-hackers |
It looks like the "No such file or directory" error followed by the abort
signal resulted from manually removing logs. pg_resetxlog took care of this,
but other problems persisted.
I got a copy of the database and installed it on a local partition.
It does seem badly corrupted; these are some hard errors.
pg_dump fails and dumps core:
pg_dump: ERROR: XLogFlush: request 0/A971020 is not satisfied ---
flushed only to 0/5000050 ... lost synchronization with server, resetting
connection
looking at the core file:
(dbx) where 15
=>[1] _libc_kill(0x0, 0x6, 0x0, 0xffffffff, 0x2eaf00, 0xff135888), at
0xff19f938
[2] abort(0xff1bc004, 0xff1c3a4c, 0x0, 0x7efefeff, 0x21c08, 0x2404c4), at
0xff13596c
[3] elog(0x14, 0x267818, 0x0, 0xa971020, 0x0, 0x5006260), at 0x2407dc
[4] XLogFlush(0xffbee908, 0xffbee908, 0x827e0, 0x0, 0x0, 0x0), at 0x78530
[5] BufferSync(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x18df2c
[6] FlushBufferPool(0x2, 0x1e554, 0x0, 0x30000, 0x0, 0xffbeea79), at
0x18e5c4
[7] CreateCheckPoint(0x0, 0x0, 0x82c00, 0xff1bc004, 0x2212c, 0x83534), at
0x7d93c
[8] BootstrapMain(0x5, 0xffbeec50, 0x10, 0xffbeec50, 0xffbeebc8,
0xffbeebc8), at 0x836bc
[9] SSDataBase(0x3, 0x40a24a8a, 0x2e3800, 0x4, 0x2212c, 0x16f504), at
0x172590
[10] ServerLoop(0x5091, 0x2e398c, 0x2e3800, 0xff1c2940, 0xff1bc004,
0xff1c2940), at 0x16f3a0
[11] PostmasterMain(0x1, 0x323ad0, 0x2af000, 0x0, 0x65720000, 0x65720000),
at 0x16ef88
[12] main(0x1, 0xffbef68c, 0xffbef694, 0x2eaf08, 0x0, 0x0), at 0x12864c
======================
(I don't have the debug build at the moment to get more details)
this query fails:
LOG: query: select count (1) from note_links_aux;
ERROR: XLogFlush: request 0/A971020 is not satisfied --- flushed only to
0/5006260
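To make the gap in that message concrete, here is a minimal sketch that compares the two positions. It assumes the `X/Y` notation is a log file id followed by a byte offset, both in hexadecimal; the helper name `parse_xlog_pos` is made up for illustration and is not part of PostgreSQL.

```python
def parse_xlog_pos(text):
    """Split an 'xlogid/xrecoff' string (both hex) into a tuple of ints."""
    xlogid, xrecoff = text.split("/")
    return int(xlogid, 16), int(xrecoff, 16)


# Values copied from the error message above.
requested = parse_xlog_pos("0/A971020")
flushed = parse_xlog_pos("0/5006260")

# Tuples compare element by element: log id first, then offset.
request_is_ahead = requested > flushed

# Both positions share log id 0 here, so the offsets can be subtracted directly.
gap_bytes = requested[1] - flushed[1]

print(request_is_ahead, hex(gap_bytes))
```

Under that assumption, the flush request points well past the end of the log that actually exists, which fits a data page carrying a log position from before the logs were removed and reset.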
drop table fails:
drop table note_links_aux;
ERROR: getObjectDescription: Rule 17019 does not exist
Are there any pointers as to why this could happen, aside
from potential memory and disk problems?
As for NFS... I know how strongly the PostgreSQL community advises
against it, but we have to face it: our customers ARE running on NFS
and they WILL be running on NFS.
Is there such a thing as "better" and "worse" NFS versions?
(I made a note of what was said about hard mount vs. soft mount, etc.)
Tom, you recommended upgrading from 7.3.2 to 7.3.6.
Our next release is using v. 7.3.4 (maybe it's not too late to upgrade).
Would v. 7.3.6 provide more protection against problems like this?
Thank you,
Mike
> -----Original Message-----
... ...
> The messages you quote certainly read like a badly corrupted database to
> me. In the case of a local filesystem I'd be counseling you to start
> running memory and disk diagnostics. That may still be appropriate
> here, but you had better also reconsider the decision to use NFS.
>
> If you're absolutely set on using NFS, one possibly useful tip is to
> make sure it's a hard mount not a soft mount. If your systems support
> NFS-over-TCP instead of UDP, that might be worth trying too.
>
> Also I would strongly advise an update to PG 7.3.6. 7.3.2 has serious
> known bugs.
>
> regards, tom lane
>