From: | "Jim C(dot) Nasby" <jim(at)nasby(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | lapham(at)jandr(dot)org, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: [GENERAL] Restart after power outage: createdb |
Date: | 2006-09-27 20:44:36 |
Message-ID: | 20060927204436.GU19827@nasby.net |
Lists: | pgsql-general pgsql-hackers |
On Wed, Sep 27, 2006 at 04:13:34PM -0400, Tom Lane wrote:
> Jon Lapham <lapham(at)jandr(dot)org> writes in pgsql-general:
> > If I run...
> > sleep 3; echo starting; createdb bar
> > ...and power off the VM while the "createdb bar" is running.
>
> > Upon restart, about 50% of the time I can reproduce the following error
> > message:
>
> > [lapham(at)localhost ~]$ psql bar
> > psql: FATAL: database "bar" does not exist
> > [lapham(at)localhost ~]$ createdb bar
> > createdb: database creation failed: ERROR: could not create directory
> > "base/65536": File exists
>
> What apparently is happening here is that the same OID has been assigned
> to the new database both times. Even though the createdb didn't
> complete, the directory it started to build is there and so there's a
> filename collision.
>
> > So, running "createdb bar" a second time works.
>
> Yeah, because the OID counter has been advanced, and so the second
> createdb uses a nonconflicting OID.
>
> In theory this scenario should not happen, because a crash-and-restart
> is supposed to guarantee that the OID counter comes up at or beyond
> where it was before the crash.
>
> After thinking about it for a while, I believe the problem is that
> CREATE DATABASE is breaking the "WAL rule": it's allowing a data change
> (specifically, creation of the new DB subdirectory) to hit disk without
> having guaranteed that associated WAL entries were flushed first.
> Specifically, if we generated an XLOG_NEXTOID WAL entry to record the
> consumption of an OID for the database, there isn't anything ensuring
> that record gets to disk before the mkdir occurs. (i.e., the comment in
> XLogPutNextOid is correct as far as it goes, but it fails to account
> for outside-the-database effects such as creation of a directory named
> after the OID.) Hence after restart the OID counter might not get
> advanced as far as it should have been.
>
> We could fix this two different ways:
>
> 1. Put an XLogFlush into createdb() somewhere between making the
> pg_database entry and starting to create subdirectories.
>
> 2. Check for conflicting database directories while assigning the OID,
> comparable to what GetNewRelFileNode() does for table files.
>
> #2 has some appeal because it could deal with random junk in
> $PGDATA/base regardless of how the junk got there. However, to do that
> in a really bulletproof way we'd have to check all the tablespace
> directories too, and that's starting to get a tad tedious for something
> that shouldn't happen anyway.
>
> So I'm leaning to #1 as a suitably low-effort fix. Thoughts?
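
To make the ordering problem above concrete, here is a toy Python model. It is not PostgreSQL code: `ToyCluster`, `createdb`, and `flush_first` are hypothetical names, and the model simplifies heavily by logging one WAL record per OID and treating "flushed" as a single durable counter value. It is only a sketch of why flushing WAL before the mkdir (option 1) should prevent the same OID from being handed out again after a crash.

```python
class ToyCluster:
    """Toy model: an OID counter logged to WAL, plus directories on 'disk'."""

    def __init__(self):
        self.durable_next_oid = 65536  # counter value recoverable after a crash
        self.next_oid = 65536          # in-memory counter, lost on crash
        self.wal_buffer = []           # WAL records written but not yet flushed
        self.dirs = set()              # directories that have reached disk

    def allocate_oid(self):
        oid = self.next_oid
        self.next_oid += 1
        # Log the advanced counter, but only into the unflushed buffer.
        self.wal_buffer.append(self.next_oid)
        return oid

    def flush_wal(self):
        # Flushing makes the logged counter value survive a crash.
        if self.wal_buffer:
            self.durable_next_oid = max(self.durable_next_oid, max(self.wal_buffer))
            self.wal_buffer.clear()

    def mkdir(self, oid):
        path = f"base/{oid}"
        if path in self.dirs:
            raise FileExistsError(path)
        self.dirs.add(path)  # the directory hits disk immediately

    def crash_and_restart(self):
        # Unflushed WAL is lost; the counter restarts from what was durable.
        self.wal_buffer.clear()
        self.next_oid = self.durable_next_oid


def createdb(cluster, flush_first):
    """Allocate an OID and create its directory, optionally flushing WAL first."""
    oid = cluster.allocate_oid()
    if flush_first:
        cluster.flush_wal()
    cluster.mkdir(oid)
    return oid
```

In this model, calling `createdb(c, flush_first=False)`, then `c.crash_and_restart()`, then `createdb` again raises `FileExistsError` for `base/65536`, and a further attempt succeeds because the in-memory counter has moved on. With `flush_first=True` the restart resumes from the advanced counter and no collision occurs.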

It'd be nice to clean things up, but I understand the reluctance to do
so. Maybe a good compromise would be to warn about files that are
present in $PGDATA but don't show up in any catalogs.

Then again, if we're doing that, we could probably just nuke 'em...
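
A minimal sketch of the warn-only idea, in Python rather than server code. It assumes the set of valid database OIDs has already been obtained elsewhere (for example from `SELECT oid FROM pg_database`) and is passed in as `known_oids`; `find_orphan_db_dirs` is a hypothetical helper name. It only looks at numerically named directories under `$PGDATA/base` and does not cover other tablespace directories, which is the tedious part mentioned above.

```python
import os
import re


def find_orphan_db_dirs(pgdata, known_oids):
    """Return base/<oid> directories whose OID is not in known_oids."""
    base = os.path.join(pgdata, "base")
    orphans = []
    for entry in sorted(os.listdir(base)):
        path = os.path.join(base, entry)
        # Only consider directories whose name is a plain decimal number.
        if not re.fullmatch(r"[0-9]+", entry) or not os.path.isdir(path):
            continue
        if int(entry) not in known_oids:
            orphans.append(path)
    return orphans
```

A caller could print a warning for each returned path. Whether removing them automatically is safe is a separate question that this sketch does not try to answer.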
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)