From: | Mike Mascari <mascarm(at)mascari(dot)com> |
---|---|
To: | kimi(at)intercept(dot)co(dot)in |
Cc: | pgsql-general(at)postgreSQL(dot)org, scrappy(at)hub(dot)org |
Subject: | Re: [GENERAL] Release LRU file |
Date: | 1999-12-21 16:16:32 |
Message-ID: | 385FA7E0.C9C26701@mascari.com |
Lists: | pgsql-general |
Kimi wrote:
>
> Hi,
>
> This is in continuation of the mails I sent last week about Postgres
> crashing.
> We are running pg 6.5.1 on Red Hat 5.1 with DBI 0.92 and DBD 1.13 on a
> SCSI machine with 512 MB RAM.
>
> Our application consists of requests going up to 150 per second on this
> database, with an expected uptime of 24 by 7.
> Earlier we were getting spinlock messages, which we hope we have sorted
> out by raising the number of open files per process to 1024 from the
> earlier 256.
>
> Postgres crashes giving an error message: FATAL 1: Release LRU file :
> No opened files / no one can be closed.
>
> Now can anybody help on how to solve this.
>
> Please help
>
> Bye,
>
> Murali
> Differentiated Software Solutions
We have been running a production server under a somewhat
lighter load, and encountered this once. The following
conversation took place on the mailing list about a month
ago:
http://www.PostgreSQL.ORG/mhonarc/pgsql-hackers/1999-11/msg00454.html
------------------------------------------------------------
Mike Mascari <mascarim(at)yahoo(dot)com> writes:
> FATAL 1: ReleaseLruFile: No opened files - no one can be closed
> This is the first time this has ever happened.
I've never seen that either. Offhand I do not recall any post-6.5
changes that would affect it, so the problem (whatever it is) is
probably still there.

After eyeballing the code, it seems there are only two ways this
could happen:
1. the number of "allocated" (non-virtual) file descriptors grew to
   exceed the number of files Postgres thinks it can have open;
2. something else was temporarily exhausting your kernel's file table
   space, so that ENFILE was returned for many successive attempts to
   open a file. (After each one, fd.c will close another file and try
   again.)
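
The close-and-retry behaviour described in case 2 can be sketched roughly as below. This is a simplified simulation, not the actual fd.c code: sim_open, release_lru_file, open_with_retry, and the fixed-size lru array are hypothetical names invented for illustration, and the "kernel" is just a counter of free slots. Treating EMFILE the same way as ENFILE is also an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define MAX_TRACKED 8            /* hypothetical cap on tracked descriptors */

/* Simulated kernel file table: slots still available to this process. */
static int sim_free_slots = 0;

/* Descriptors we hold that may be closed on demand, oldest first. */
static int lru[MAX_TRACKED];
static int lru_count = 0;
static int next_fd = 3;

/* Stand-in for open(2): fails with ENFILE when no slot is free. */
static int sim_open(void)
{
    if (sim_free_slots == 0) {
        errno = ENFILE;
        return -1;
    }
    sim_free_slots--;
    return next_fd++;
}

/* Close the least recently used descriptor; return 0 if none is left. */
static int release_lru_file(void)
{
    if (lru_count == 0)
        return 0;
    for (int i = 1; i < lru_count; i++)
        lru[i - 1] = lru[i];
    lru_count--;
    sim_free_slots++;            /* the closed file frees one slot */
    return 1;
}

/* Open a file, closing older ones as needed. Returns -1 when the
 * table is full and there is nothing left that we could close. */
static int open_with_retry(void)
{
    for (;;) {
        int fd = sim_open();
        if (fd >= 0) {
            assert(lru_count < MAX_TRACKED);
            lru[lru_count++] = fd;
            return fd;
        }
        /* Assumption: EMFILE (per-process limit) is handled like ENFILE. */
        if (errno != ENFILE && errno != EMFILE)
            return -1;
        if (!release_lru_file()) {
            fprintf(stderr, "no opened files - no one can be closed\n");
            return -1;
        }
    }
}
```

In this model the failure only occurs when the table is full and the list of closable files is already empty, which corresponds to the two cases listed above.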
#2 seems improbable on an unloaded system, and isn't real probable even
on a loaded one, since you'd have to assume that some other process
managed to suck up each filetable slot that fd.c released before fd.c
could re-acquire it. Once, yes, but several dozen times in a row?

So I'm guessing a leak of allocated file descriptors.
After grovelling through the calls to AllocateFile, I only see one
prospect for a leak: it looks to me like verify_password() neglects
to close the password file if an invalid user name is given. Do you
use a plain (non-encrypted) password file? If so, I'll bet you can
reproduce the crash by trying repeatedly to connect with a username
that's not in the password file. If that pans out, it's a simple fix:
add "FreeFile(pw_file);" near the bottom of verify_password() in
src/backend/libpq/password.c. Let me know if this guess is right...
regards, tom lane
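
The kind of leak suspected above, and the shape of the suggested fix, can be illustrated with a small standalone sketch. It is not the code in password.c: check_password, tracked_open, tracked_close, and the open_handles counter are hypothetical, plain fopen/fclose stand in for AllocateFile/FreeFile, and the "user:password" line format is assumed only for the example. The point is that the file is closed on a single exit path, so the unknown-user case cannot skip it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical counter so a leaked handle is easy to observe. */
static int open_handles = 0;

/* Stand-in for AllocateFile(): open and count the handle. */
static FILE *tracked_open(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f != NULL)
        open_handles++;
    return f;
}

/* Stand-in for FreeFile(): close and uncount the handle. */
static void tracked_close(FILE *f)
{
    fclose(f);
    open_handles--;
}

/* Returns 1 if the file has a "user:password" line matching both
 * arguments, 0 otherwise (including when the user is not listed). */
static int check_password(const char *path, const char *user,
                          const char *password)
{
    char line[256];
    int ok = 0;

    assert(path != NULL && user != NULL && password != NULL);

    FILE *pw_file = tracked_open(path);
    if (pw_file == NULL)
        return 0;

    size_t ulen = strlen(user);
    while (fgets(line, sizeof line, pw_file) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the newline */
        if (strncmp(line, user, ulen) == 0 && line[ulen] == ':') {
            ok = strcmp(line + ulen + 1, password) == 0;
            break;
        }
    }

    /* Single exit: the file is closed whether the user matched,
     * the password was wrong, or the user was never found. */
    tracked_close(pw_file);
    return ok;
}
```

With an early "return 0" inside the not-found path instead, each failed lookup would leave one handle open, and repeated attempts with an unknown username would eventually exhaust the descriptors the process is allowed.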
------------------------------------------------------------
Hope that helps,
Mike Mascari