From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Mark Dilger <markdilger(at)yahoo(dot)com> |
Cc: | deepak <deepak(dot)pn(at)gmail(dot)com>, Alban Hertroys <haramrae(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: FATAL: lock file "postmaster.pid" already exists |
Date: | 2012-05-23 18:17:30 |
Message-ID: | 8405.1337797050@sss.pgh.pa.us |
Lists: | pgsql-general |
Mark Dilger <markdilger(at)yahoo(dot)com> writes:
> Prior to posting to the mailing list, we made some
> changes in postmaster.c to identify where time was
> being spent. Based on the elog(NOTICE,...) lines
> we put in the file, we determined the time was spent
> inside RemovePgTempFiles.
> I then altered RemovePgTempFiles to take a starttime
> parameter and, while recursing, to check if more than
> 5 seconds has passed since it started. I did not want
> to add the complexity of setting an alarm and catching
> the signal, so I just made the code check the wallclock
> time at each step of the recursion. When more than
> 5 seconds has passed, it does not recurse further.
> After making this change, we have not been able to
> reproduce the slowness.
OK, so we're back to the original question: how could this possibly be
taking that long? Have you got thousands of tablespaces (and if so why)?
Does your system have a habit of crashing at times when there are
thousands of temp files? Maybe you're using IP over avian carriers to
access your SAN? It just doesn't make any sense given the information
you've provided.
regards, tom lane
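
For readers following the thread, the time-bounded cleanup Mark describes in the quoted text might look roughly like the sketch below. It is not his patch and not PostgreSQL's actual RemovePgTempFiles: the function name remove_temp_files_bounded, its parameters, and the generic directory walk are all assumptions made for illustration. The only idea taken from the message is that the wall-clock time is checked at each step of the recursion and, once the budget is spent, the code does not recurse further.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Prefix PostgreSQL gives its temporary files and directories. */
#define TEMP_PREFIX "pgsql_tmp"

/*
 * Hypothetical helper (NOT PostgreSQL source): walk dirpath, unlink plain
 * files whose names start with TEMP_PREFIX, and descend into subdirectories.
 * starttime is the wall-clock time at which the whole cleanup began;
 * budget_secs is the limit (5 in the message above).
 * Returns the number of files removed.
 */
int
remove_temp_files_bounded(const char *dirpath, time_t starttime, int budget_secs)
{
    DIR        *dir;
    struct dirent *de;
    int         removed = 0;

    assert(dirpath != NULL);

    /* Checked on entry to every level: once over budget, go no deeper. */
    if (difftime(time(NULL), starttime) > budget_secs)
        return 0;

    dir = opendir(dirpath);
    if (dir == NULL)
        return 0;

    while ((de = readdir(dir)) != NULL)
    {
        char        path[4096];
        struct stat st;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        /* Skip entries whose full path would not fit in the buffer. */
        if (snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name)
            >= (int) sizeof(path))
            continue;

        if (lstat(path, &st) != 0)
            continue;

        if (S_ISDIR(st.st_mode))
            removed += remove_temp_files_bounded(path, starttime, budget_secs);
        else if (strncmp(de->d_name, TEMP_PREFIX, strlen(TEMP_PREFIX)) == 0 &&
                 unlink(path) == 0)
            removed++;
    }

    closedir(dir);
    return removed;
}
```

A caller would pass time(NULL) as starttime once, at the top level, for example remove_temp_files_bounded("base", time(NULL), 5). Note that a cutoff like this only bounds the startup delay; it may leave temp files behind, and it does not explain why the scan was slow in the first place, which is the question raised in the reply.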