From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Jim C(dot) Nasby" <jim(at)nasby(dot)net> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: cuckoo is hung during regression test |
Date: | 2007-02-13 18:15:29 |
Message-ID: | 2308.1171390529@sss.pgh.pa.us |
Lists: | pgsql-hackers |
"Jim C. Nasby" <jim(at)nasby(dot)net> writes:
> The postmaster is stuck in the following loop, according to
> ktrace/kdump:
> 2023 postgres CALL select(0x8,0xbfffe194,0,0,0xbfffe16c)
> 2023 postgres RET select 1
> 2023 postgres CALL sigprocmask(0x3,0x2f0d38,0)
> 2023 postgres RET sigprocmask 0
> 2023 postgres CALL accept(0x7,0x200148c,0x200150c)
> 2023 postgres RET accept -1 errno 24 Too many open files
> 2023 postgres CALL write(0x2,0x2003928,0x3b)
> 2023 postgres GIO fd 2 wrote 59 bytes
> "LOG: could not accept new connection: Too many open files
> "
> 2023 postgres RET write 59/0x3b
> 2023 postgres CALL close(0xffffffff)
> 2023 postgres RET close -1 errno 9 Bad file descriptor
> 2023 postgres CALL sigprocmask(0x3,0x2e6400,0)
> 2023 postgres RET sigprocmask 0
> 2023 postgres CALL select(0x8,0xbfffe194,0,0,0xbfffe16c)
> 2023 postgres RET select 1
Interesting. So accept() fails because it can't allocate an FD, which
means that the select condition isn't cleared, so we keep retrying
forever. I don't see what else we could do though. Having the
postmaster abort on what might well be a transient condition doesn't
sound like a hot idea. We could possibly sleep() a bit before retrying,
just to not suck 100% CPU, but that doesn't really *fix* anything ...
I've been meaning to bug you about increasing cuckoo's FD limit anyway;
it keeps failing in the regression tests.
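
To illustrate the sleep-before-retry idea mentioned above, here is a minimal sketch in Python rather than the postmaster's actual C code. The function name `accept_with_backoff` and its parameters (`delay`, `max_attempts`, `sleep`) are hypothetical, chosen only for this example. It shows a listener that pauses when accept() fails for lack of file descriptors instead of spinning; as noted, this only limits CPU use and does not fix the underlying shortage.

```python
import errno
import socket
import time


def accept_with_backoff(listener, delay=0.1, max_attempts=5, sleep=time.sleep):
    """Try to accept one connection, pausing when descriptors run out.

    Returns the (connection, address) pair from listener.accept(), or None
    if every attempt failed with EMFILE or ENFILE. All names and defaults
    here are illustrative assumptions, not PostgreSQL's.
    """
    for _ in range(max_attempts):
        try:
            return listener.accept()
        except OSError as exc:
            # Only descriptor exhaustion is treated as possibly transient.
            if exc.errno not in (errno.EMFILE, errno.ENFILE):
                raise
            # The pending connection stays queued, so select() would report
            # the socket readable again at once; sleep to avoid a busy loop.
            sleep(delay)
    return None
```

A caller would invoke this once select() reports the listening socket readable, and go back to select() if it returns None.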
> ulimit is set to 1224 open files, though I seem to keep bumping into that
> (anyone know what the system-level limit is, or how to change it?)
On my OS X machine, "ulimit -n unlimited" seems to set the limit to
10240 (or so a subsequent ulimit -a reports). But you could probably
fix it using the buildfarm parameter that cuts the number of concurrent
regression test runs.
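
For checking or raising the per-process limit from inside a script instead of the shell, the following is a rough sketch using Python's standard `resource` module (Unix only). The helper name `raise_nofile_limit` is hypothetical, and the default target of 10240 is simply the figure reported above for one OS X machine, not a value known to apply elsewhere. The kernel may refuse the request, in which case the limit is left unchanged.

```python
import resource


def raise_nofile_limit(target=10240):
    """Try to raise the soft open-file limit toward `target`.

    Returns the (soft, hard) pair in effect afterwards. The target default
    is an assumption taken from the message above, not a system constant.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit cannot exceed a finite hard limit.
    if hard == resource.RLIM_INFINITY:
        wanted = target
    else:
        wanted = min(target, hard)
    if soft != resource.RLIM_INFINITY and soft < wanted:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        except (ValueError, OSError):
            # The system rejected the value; keep the existing limit.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling `raise_nofile_limit()` early in a test driver and printing the returned pair shows what limit child processes will inherit.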
regards, tom lane