Re: postmaster dies (was Re: Very disappointing performance)

From: secret <secret(at)kearneydev(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)hub(dot)org
Subject: Re: postmaster dies (was Re: Very disappointing performance)
Date: 1999-03-16 14:05:02
Message-ID: 36EE650D.4C0818D2@kearneydev.com
Lists: pgsql-hackers

Tom Lane wrote:

> secret <secret(at)kearneydev(dot)com> writes:
> >>>> PostgreSQL is also crashing 1-2 times a day on me, although I have a
> >>>> handy perl script to keep it alive now <grin>...
>
> > basically the server randomly dies with a:
> > ERROR: postmaster: StreamConnection: accept: Invalid argument
> > pmdie 3
> > (then signals all children to drop dead)
>
> Hmm. That shouldn't happen, especially not randomly; if the accept
> works the first time then it should work forever after, since the
> arguments being passed in never change.
>
> The error is coming from StreamConnection() in
> pgsql/src/backend/libpq/pqcomm.c. Could you maybe add some debugging
> code to the routine to see what the server_fd and port arguments are
> when accept() fails? I think just changing the first elog() to
>
> elog(ERROR,
> "postmaster: StreamConnection: accept: %m\nserver_fd = %d, port = %p",
> server_fd, port);
>
> would do for starters. This would let us eliminate the possibility that
> the routine is getting passed bad arguments.
>
> An alternative possibility is to run the postmaster under truss so you
> can see what arguments are passed to the kernel on every kernel call,
> but that'd generate a pretty verbose logfile.
>
> regards, tom lane

query: SELECT "material_id" ,"name" ,"short_name" ,"legacy" FROM "material" ORDER BY "legacy" DESC,"name"
ProcessQuery
! system usage stats:
! 0.017961 elapsed 0.020000 user 0.000000 system sec
! [0.050000 user 0.020000 sys total]
! 0/0 [0/0] filesystem blocks in/out
! 6/24 [127/201] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
! Shared blocks: 0 read, 0 written, buffer hit rate = 100.00%
! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Direct blocks: 0 read, 0 written
CommitTransactionCommand
ERROR: postmaster: StreamConnection: accept: Invalid argument
server_fd = 3, port = 0x816aa70
pmdie 3
SignalChildren: sending signal 15 to process 16943
SignalChildren: sending signal 15 to process 16942
SignalChildren: sending signal 15 to process 16941

There we go, it crashed this morning (interestingly, it went all of
yesterday without crashing). Does this shed some light? If not, what would
you like me to do next? I have 700M+ available for a log file, so as long as
it doesn't generate that much in a day we should be okay with a very verbose log.

Just tell me what code mods or runtime options to use...

David Secret
MIS Director
Kearney Development Co., Inc.
