Re: Problems with 8.3

From: "Alex Turner" <armtuk(at)gmail(dot)com>
To: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
Cc: "Richard Huxton" <dev(at)archonet(dot)com>, "PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Problems with 8.3
Date: 2008-03-08 04:42:52
Message-ID: 33c6269f0803072042w35476925hf415003a555c42fe@mail.gmail.com
Lists: pgsql-general

Well - I think it might be that some of my servlets weren't closing
their database connections properly.

I do have some new evidence though:

I did an strace of the tomcat processes, and I noticed something that
might be odd, but I'm not really qualified to say. I notice that
every time a socket sends a request to PostgreSQL it gets some kind of
reply. This is true in all cases EXCEPT when the application crashes.
Here is the segment of the strace right before it throws a wobbly:

[pid 4565] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 156
[pid 4565] bind(156, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 4565] getsockname(156, {sa_family=AF_INET, sin_port=htons(56550), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 4565] connect(156, {sa_family=AF_INET, sin_port=htons(5432), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid 4565] setsockopt(156, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 4565] send(156, "\0\0\0W\0\3\0\0user\0postgres\0database\0t"..., 87, 0) = 87
[pid 4565] recv(156, "R\0\0\0\10\0\0\0\0S\0\0\0\34client_encoding\0UN"..., 8192, 0) = 279
[pid 4565] gettimeofday({1204948966, 386187}, NULL) = 0
[pid 4565] send(156, "P\0\0\1\35\0\r\n \t\tselect"..., 334, 0) = 334
[pid 4565] recv(156, "", 8192, 0) = 0
[pid 4565] send(156, "X\0\0\0\4", 5, 0) = 5
[pid 4565] dup2(11, 156) = 156
[pid 4565] close(156) = 0

Notice that the recv(156,... after sending the query comes back empty
(0 bytes), which seems odd given that we just sent a query to the
database.
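
As an aid to reading the trace above, here is a minimal sketch of how the first bytes of each send() can be decoded. It assumes the standard PostgreSQL v3 wire framing (one type byte followed by a 4-byte big-endian length that counts itself but not the type byte). The function names and the partial type table are made up for illustration and are not part of any driver.

```python
import struct

# Partial table of frontend (client -> server) message type bytes in the
# v3 protocol. Illustrative only; not a complete list.
FRONTEND_TYPES = {
    b"P": "Parse",
    b"B": "Bind",
    b"E": "Execute",
    b"S": "Sync",
    b"Q": "Query",
    b"X": "Terminate",
}


def describe_frontend_header(data: bytes):
    """Return (message_name, declared_length) from a message's first 5 bytes."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes: type byte + int32 length")
    type_byte = data[0:1]
    # Network byte order, unsigned 32-bit; the length includes these 4 bytes.
    (length,) = struct.unpack("!I", data[1:5])
    return FRONTEND_TYPES.get(type_byte, "unknown"), length


def describe_recv(nbytes: int) -> str:
    """Interpret the return value of recv() on a blocking TCP socket."""
    if nbytes == 0:
        # Zero bytes on a stream socket means the peer closed the connection.
        return "peer closed connection (EOF)"
    if nbytes < 0:
        return "error (check errno)"
    return "%d bytes of data" % nbytes


if __name__ == "__main__":
    # Byte strings copied from the strace output (octal escapes as shown).
    print(describe_frontend_header(b"P\0\0\1\35"))
    print(describe_frontend_header(b"X\0\0\0\4"))
    print(describe_recv(0))
```

Read this way, the "P" header declares a 285-byte Parse message, so the 334-byte send presumably carries further messages after it, and "X" is the client's Terminate. A recv() that returns 0 is not an empty reply: it indicates that the other end closed the connection.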

I'm really in a bind with this one. It started happening a couple of
days ago, and all our admin applications are basically down :(. People
can't even log the bugs that this is generating, because the bug
tracker (Trac) runs on this same PostgreSQL server and is throwing
errors too.

I also caught something else that seemed weird on another trace:

[pid 3553] send(28, "P\0\0\0H\0delete from result_cache w"..., 108, 0) = 108
[pid 3553] recv(28, "N\0\0\1\202SWARNING\0C57P02\0Mterminatin"..., 8192, 0) = 387
[pid 3553] gettimeofday({1204946902, 977641}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977682}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977766}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977902}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977973}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978012}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978053}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978091}, NULL) = 0
[pid 3553] recv(28, "", 8192, 0) = 0
[pid 3553] send(28, "X\0\0\0\4", 5, 0) = -1 EPIPE (Broken pipe)
[pid 3553] --- SIGPIPE (Broken pipe) @ 0 (0) ---
[pid 3553] rt_sigreturn(0x9) = -1 EPIPE (Broken pipe)
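
The "N" reply in the trace above can be picked apart the same way. The following is a rough sketch, assuming the v3 protocol's NoticeResponse layout (type byte, int32 length, then a series of fields, each a one-byte field code followed by a NUL-terminated string). Only the bytes visible in the strace line are used, so the last field is reported as truncated rather than completed. The function name is hypothetical.

```python
import struct


def parse_notice_prefix(data: bytes):
    """Parse the visible start of a backend NoticeResponse/ErrorResponse.

    Returns (type_char, declared_length, fields, truncated), where fields
    maps one-letter field codes (e.g. "S" severity, "C" SQLSTATE,
    "M" message) to their string values.
    """
    if len(data) < 5:
        raise ValueError("need at least 5 bytes: type byte + int32 length")
    msg_type = data[0:1].decode("ascii")
    (length,) = struct.unpack("!I", data[1:5])

    fields = {}
    truncated = False
    pos = 5
    while pos < len(data):
        code = data[pos:pos + 1]
        if code == b"\0":
            break  # a zero byte in place of a field code ends the field list
        end = data.find(b"\0", pos + 1)
        if end == -1:
            # The capture ends mid-string (strace shortened it with "...").
            fields[code.decode("ascii")] = data[pos + 1:].decode("ascii", "replace")
            truncated = True
            break
        fields[code.decode("ascii")] = data[pos + 1:end].decode("ascii", "replace")
        pos = end + 1
    return msg_type, length, fields, truncated


if __name__ == "__main__":
    # Bytes exactly as shown by strace, including the cut-off message text.
    captured = b"N\0\0\1\202SWARNING\0C57P02\0Mterminatin"
    print(parse_notice_prefix(captured))
```

On the captured bytes this yields severity WARNING and SQLSTATE 57P02, with the message text cut off. The declared length of 386 plus the type byte matches the 387 bytes received. In PostgreSQL's error code table, 57P02 is crash_shutdown, which would suggest the backend ended this session because something crashed on the server side, and that the closed socket and the EPIPE on the following send are a consequence of that rather than the cause.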

I couldn't reproduce this, though. It just randomly throws a SIGPIPE
after the query. The other weird thing is that this process also
throws a SIGSEGV at another point. I wasn't expecting Tomcat to
crash, so alas I didn't capture a core file. I guess I should enable
core dumps as the system default.

Alex

On Fri, Mar 7, 2008 at 2:28 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
> On Fri, Mar 7, 2008 at 11:17 AM, Alex Turner <armtuk(at)gmail(dot)com> wrote:
> > I didn't. And after the reboot, I still see 8 new sockets stuck in
> > CLOSE_WAIT - I'm wondering if this is a hardware/kernel problem...
>
> Having sockets in CLOSE_WAIT is actually pretty normal
>
