From: | Michael Clark <codingninja(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Sebastien Boisvert <sebastienboisvert(at)yahoo(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: postmaster.pid file auto-clean up? |
Date: | 2012-08-26 04:56:57 |
Message-ID: | CACAT_AcbbfeG47a7apc7goR8bqcwpsJQwfzXYsGLxrPhTAtraQ@mail.gmail.com |
Lists: | pgsql-general |
On Mon, Aug 20, 2012 at 11:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Sebastien Boisvert <sebastienboisvert(at)yahoo(dot)com> writes:
> > Is this mechanism documented anywhere (besides source code)?
>
> No, not really.
>
> > It looks like PG will only clean it up if there's no other process
> > running at all on the pid listed in the postmaster.pid file, even if any
> > process running on that pid isn't a PG process or there's no server running
> > on the data directory (as per `pg_ctl status`).
>
> Not sure what you're looking at, but the above is wrong in at least one
> critical detail, namely that there's a process-ownership check via
> kill(). There are also checks to ensure no children of the previous
> postmaster are still alive. These are not things you want to lightly
> bypass: two sets of postmaster children running against the same data
> directory *will* result in unrecoverable data corruption.
>
> If you're trying to claim you've seen a false-positive situation, it
> would be interesting to hear actual details.
>
Hello, I work with Seb, and have been investigating this further.
It does in fact appear that we are getting false positives.
When trying to start PG using pg_ctl, I am getting this response:
pg_ctl: another server might be running; trying to start server anyway
2012-08-26 04:46:02.211 GMT [] - FATAL: lock file "postmaster.pid" already exists
2012-08-26 04:46:02.211 GMT [] - HINT: Is another postmaster (PID 8574) running in data directory "/Users/mclark/Library/Application Support/com.marketcircle.Daylite4/StorageDebug.dlpdb/Data/9_1"?
pg_ctl: this data directory appears to be running a pre-existing postmaster
pg_ctl: could not start server
Examine the log output.
PID 8574 is actually iTunes, not PG. PG was cleanly brought down on
its last run, and there are no child processes running.
Seb figured out how to contrive this situation:
run PG, copy the pid file, stop PG, copy the saved pid file back to the
data dir, and edit it, replacing the old PID with that of another running
process.
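To illustrate why a PID file edited this way might be accepted, here is a minimal sketch of a kill()-style probe in Python. It assumes a POSIX system, where sending signal 0 delivers nothing and only runs the existence and permission checks. The function name `probe_pid` and its return labels are hypothetical, and this is not a copy of PG's own logic.

```python
import os


def probe_pid(pid):
    """Probe pid with signal 0 (POSIX): no signal is delivered,
    only the existence and permission checks are performed."""
    if pid <= 0:
        # Zero and negative values address process groups, not one process.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no process has this PID.
        return "missing"
    except PermissionError:
        # EPERM: a process exists, but we may not signal it
        # (typically because it belongs to another user).
        return "not-permitted"
    # The process exists and we are allowed to signal it.
    return "signalable"
```

Under these assumptions, any process owned by the same user (iTunes in the case above) yields the same result as a real postmaster would, so the probe on its own cannot tell the two apart.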
At first we thought our software was to blame, because it checks the PID
from PG's pid file to see if a process is running with that PID, and if
none are found then we call pg_ctl, otherwise we just continue launching
our software and trying to connect to PG.
I just added an additional check to see if the process name for the PID is
postgres, and if not then try to start PG with pg_ctl, thinking it would
figure it out and remove the pid file as it would if there was no process
running with that pid.
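The check described above could be sketched roughly as follows. This is only an illustration, not our actual code: it assumes the PID is on the first line of postmaster.pid, that `ps -p <pid> -o comm=` is available (as on OS X and Linux), and that the server process is named `postgres`. The helper names `read_pid`, `process_name` and `should_start_server` are made up for the example.

```python
import os
import subprocess
from pathlib import Path


def read_pid(data_dir):
    """Return the PID on the first line of postmaster.pid, or None."""
    try:
        lines = (Path(data_dir) / "postmaster.pid").read_text().splitlines()
        pid = int(lines[0].strip())
    except (OSError, IndexError, ValueError):
        return None
    return pid if pid > 0 else None


def process_name(pid):
    """Ask ps for the command name of pid; None if no such process."""
    result = subprocess.run(
        ["ps", "-p", str(pid), "-o", "comm="],
        capture_output=True, text=True,
    )
    name = result.stdout.strip()
    if result.returncode != 0 or not name:
        return None
    # Some platforms print the full executable path, so keep the last part.
    return os.path.basename(name)


def should_start_server(data_dir, name_lookup=process_name):
    """True when the pid file is absent, unreadable, or names a process
    that is gone or is not called "postgres"."""
    pid = read_pid(data_dir)
    if pid is None:
        return True
    return name_lookup(pid) != "postgres"
```

The `name_lookup` parameter exists only so the decision can be exercised without a running server, for example `should_start_server(data_dir, name_lookup=lambda pid: "iTunes")`.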
Is this considered a bug? Should PG do a similar check on the process
name, or is the way we contrived this doing something unexpected?
Thanks,
Michael.