From: | "BJ Taylor" <btaylor(at)propertysolutions(dot)com> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | Re: missing chunk number 0 for toast value |
Date: | 2008-09-25 16:09:48 |
Message-ID: | 3d78fcfd0809250909h148daba8k2ad29e176d9a06fd@mail.gmail.com |
Lists: | pgsql-admin |
Hey Tom,
Here are some recent logs from our system. Unfortunately, I didn't think to
grab the logs at the time I killed those processes, and now they are gone.
I found those processes by using ps, and then I killed them with a simple
kill *processid*. Here are samples of our current log files:
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
PANIC: right sibling's left-link doesn't match: block 175337 links to
243096 instead of expected 29675 in index "dbmail_headervalue_3"
STATEMENT: INSERT INTO dbmail_headervalue (headername_id, physmessage_id,
headervalue) VALUES (4,12335778,'from [76.13.13.25] by
n6.bullet.mail.ac4.yahoo.com with NNFMP; 25 Sep 2008 04:01:36 -0000')
LOG: server process (PID 13888) was terminated by signal 6: Aborted
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
FATAL: the database system is in recovery mode
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2008-09-25 09:12:41
MDT
LOG: database system was not properly shut down; automatic recovery in
progress
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
...
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
LOG: redo starts at 3A/2D0DEA78
LOG: record with zero length at 3A/2D1B8D68
LOG: redo done at 3A/2D1B8D3C
LOG: last completed transaction was at log time 2008-09-25
09:12:45.204162-06
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
...
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
LOG: redo starts at 3A/2D1B8DA8
LOG: unexpected pageaddr 3A/2520A000 in log file 58, segment 45, offset
2138112
LOG: redo done at 3A/2D208660
LOG: last completed transaction was at log time 2008-09-25
09:12:47.971207-06
FATAL: the database system is in recovery mode
FATAL: the database system is in recovery mode
...
LOG: unexpected EOF on client connection
LOG: unexpected EOF on client connection
ERROR: missing chunk number 0 for toast value 554365
STATEMENT: SELECT messageblk, is_header FROM dbmail_messageblks WHERE
physmessage_id = 12111760 ORDER BY messageblk_idnr
LOG: unexpected EOF on client connection
LOG: unexpected EOF on client connection
To be honest, I don't know whether all of these logs are relevant. I half
suspect that nagios causes the "unexpected EOF on client connection"
notices, but I can't be certain.
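One way to separate the entries that matter from the repeated recovery noise in excerpts like the ones above is a small filter, sketched here in Python. It is only an illustration: the function name `summarize` is hypothetical, and it assumes each log entry starts with one of the severity prefixes visible in the samples (PANIC, FATAL, ERROR, WARNING, LOG) with no timestamp or PID in front. A different `log_line_prefix` setting would need a different pattern.

```python
import re
from collections import Counter

# Severity prefixes exactly as they appear in the pasted excerpts.
# Assumption: no timestamp or PID precedes the severity on each line.
SEVERITY = re.compile(r"^(PANIC|FATAL|ERROR|WARNING|LOG):\s*(.*)$")


def summarize(lines):
    """Count entries per severity and collect PANIC/ERROR messages.

    Lines that do not start with a known severity (DETAIL, HINT,
    STATEMENT, and wrapped continuation lines) are skipped.
    """
    counts = Counter()
    notable = []
    for line in lines:
        match = SEVERITY.match(line.strip())
        if match is None:
            continue
        level, message = match.groups()
        counts[level] += 1
        if level in ("PANIC", "ERROR"):
            notable.append((level, message))
    return counts, notable
```

Run over the samples above, this would report the index PANIC and the toast ERROR as the notable entries, while the "recovery mode" and "terminating connection" lines would show up only as counts.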
You also asked how it is unstable. It drops connections seemingly at
random. When a connection is dropped, the client receives the following error:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Please let me know if there are any other questions I can answer for you.
Thanks,
BJ
On Thu, Sep 25, 2008 at 7:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "BJ Taylor" <btaylor(at)propertysolutions(dot)com> writes:
> > We are using version 8.3.1. And to be precise, when I started the vacuum
> > (analyze), I started it as a cron job to run daily around midnight. The
> > next day I came in and checked on it and it was still running. Not thinking
> > that it would take more than a full 24 hours to run, I let it be, and the
> > next day I came in and the server started acting weird. I believe the
> > vacuum process continued to run, and a second vacuum process was started.
> > The server became unstable, and refused incoming connections.
>
> Unstable how? What error did you get on the refused connections? What
> was showing up in the postmaster log?
>
> > At which
> > point, I killed all vacuum processes, and restarted postgresql.
>
> How did you do that killing exactly?
>
> > I believe it was somewhere during this process that the database became
> > corrupted. I am not certain what happens when two vacuum processes run
> > at the same time.
>
> Nothing of interest, it's done all the time.
>
> > That may have been the problem, or it may not have. Or it may have been
> > that I killed the vacuum process in the middle of what it was doing. One
> > way or another, the problem that we have now, is that we are unable to get a
> > dump of the database for backups, and the database seems less stable than it
> > was previously (dropping connections, and refusing connections seemingly at
> > random).
>
> Again, what errors are you getting exactly, and what shows up in the
> postmaster log?
>
> regards, tom lane