Re: "Bogus data in lock file" shouldn't be FATAL?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: "Bogus data in lock file" shouldn't be FATAL?
Date: 2010-08-16 15:25:09
Message-ID: AANLkTinSs558j+hhEYvb8jZ7yW-8h9UFkJ-VKkp3ysVW@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 16, 2010 at 11:18 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> This complaint:
> http://archives.postgresql.org/pgsql-admin/2010-08/msg00111.php
>
> seems to suggest that this code in CreateLockFile() is not well-thought-out:
>
>                if (other_pid <= 0)
>                        elog(FATAL, "bogus data in lock file \"%s\": \"%s\"",
>                                 filename, buffer);
>
> as it means that a corrupted (empty, in this case) postmaster.pid file
> prevents the server from starting until somebody intervenes manually.
>
> I think that the original concern was that if we couldn't read valid
> data out of postmaster.pid then we couldn't be sure if there were a
> conflicting postmaster running.  But if that's the plan then
> CreateLockFile is violating it further down, where it happily skips the
> PGSharedMemoryIsInUse check if it fails to pull shmem ID numbers from
> the file.
>
> We could perhaps address that risk another way: after having written
> postmaster.pid, try to read it back to verify that it contains what we
> wrote, and abort if not.  Then, if we can't read it during startup,
> it's okay to assume there is no conflicting postmaster.

What if it was readable when written but has since become unreadable?

My basic feeling on this is that manual intervention to start the
server is really undesirable and we should try hard to avoid needing
it. That having been said, accidentally starting two postmasters at
the same time that are accessing the same data files would be several
orders of magnitude worse. We can't afford to compromise on any
interlock mechanisms that are necessary to prevent that from
happening.
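
To make the quoted proposal concrete, here is a rough sketch of the write-then-read-back idea. It is not the actual CreateLockFile() code: it uses plain POSIX calls, and the helper name write_and_verify_lock_file and the 256-byte buffer limit are assumptions made purely for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: write `contents` to `path`,
 * then reopen the file and confirm it reads back byte for byte.
 * Returns 0 on success, -1 on any I/O failure or mismatch.
 */
static int
write_and_verify_lock_file(const char *path, const char *contents)
{
    char    buffer[256];        /* assumed upper bound on lock file size */
    size_t  len;
    ssize_t nread;
    int     fd;

    assert(path != NULL && contents != NULL);

    len = strlen(contents);
    if (len >= sizeof(buffer))
        return -1;

    /* Write the data and push it to stable storage. */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, contents, len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;

    /* Read it back; an empty or truncated file fails the length check. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    nread = read(fd, buffer, sizeof(buffer) - 1);
    close(fd);
    if (nread != (ssize_t) len)
        return -1;

    return memcmp(buffer, contents, len) == 0 ? 0 : -1;
}
```

A check like this only shows that the file was intact at write time (and the read may well be served from the OS cache), so it does not answer the question above about a file that becomes unreadable later.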

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
