From: | Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Christoph Berg <myon(at)debian(dot)org>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: fsync-pgdata-on-recovery tries to write to more files than previously |
Date: | 2015-05-27 06:16:39 |
Message-ID: | 20150527061639.GA31904@toroid.org |
Lists: | pgsql-hackers |
At 2015-05-26 22:44:03 +0200, andres(at)anarazel(dot)de wrote:
>
> So what I propose is:
> 1) Remove the automatic symlink following
> 2) Follow pg_tblspc/*, pg_xlog if it's a symlink, fix the latter in
> initdb -S
> 3) Add an elevel argument to walkdir(), return if AllocateDir() fails,
> continue for stat() failures in the readdir() loop.
> 4) Add elevel argument to pre_sync_fname, fsync_fname, return after
> errors.
> 5) Accept EACCES, ETXTBSY (if defined) when open()ing the files. By
> virtue of not following symlinks we should not need to worry about
> EROFS
Here's a WIP patch for discussion.
I've (a) removed the S_ISLNK() branch in walkdir, (b) reintroduced
walktblspc_links to call walkdir on each of the entries within pg_tblspc
(simpler than trying to make walkdir follow links only for pg_xlog and
under pg_tblspc), (c) added a call to walkdir on pg_xlog if it's a
symlink (not done for initdb -S; will submit separately), (d) added
elevel arguments as described, and (e) made the code ignore EACCES and
ETXTBSY.
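
For illustration only (this is not the attached patch), the behaviour in
(a) and (e) can be sketched in plain POSIX C. The names sketch_fsync_path
and sketch_walk are hypothetical, errors go to stderr instead of through
an elevel argument, and tolerating EBADF/EINVAL when fsyncing a directory
is an assumption of this sketch:

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL's fsync_fname): fsync one path.
 * Returns 0 on success or when open() fails with a tolerated errno,
 * -1 on any other failure. It never aborts the process.
 */
static int
sketch_fsync_path(const char *path, int isdir)
{
	int			fd;
	int			rc;

	assert(path != NULL);

	/* Files are opened read-write, directories read-only. */
	fd = open(path, isdir ? O_RDONLY : O_RDWR);
	if (fd < 0)
	{
		if (errno == EACCES)
			return 0;			/* no permission: skip quietly */
#ifdef ETXTBSY
		if (errno == ETXTBSY)
			return 0;			/* executable that is running: skip */
#endif
		fprintf(stderr, "could not open \"%s\": %s\n", path, strerror(errno));
		return -1;
	}

	rc = fsync(fd);
	/* Assumption: a platform that cannot fsync a directory is harmless. */
	if (rc != 0 && isdir && (errno == EBADF || errno == EINVAL))
		rc = 0;
	if (rc != 0)
		fprintf(stderr, "could not fsync \"%s\": %s\n", path, strerror(errno));
	close(fd);
	return rc;
}

/*
 * Hypothetical walker: recurse into real directories, fsync regular
 * files, and skip symlinks (lstat() is used so links are never followed).
 * Returns the number of failures that were reported.
 */
static int
sketch_walk(const char *path)
{
	DIR		   *dir;
	struct dirent *de;
	int			failures = 0;

	dir = opendir(path);
	if (dir == NULL)
	{
		fprintf(stderr, "could not open directory \"%s\": %s\n",
				path, strerror(errno));
		return 1;
	}

	while ((de = readdir(dir)) != NULL)
	{
		char		sub[4096];
		struct stat st;

		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		snprintf(sub, sizeof(sub), "%s/%s", path, de->d_name);

		if (lstat(sub, &st) < 0)
		{
			/* Report the failure and continue with the next entry. */
			fprintf(stderr, "could not stat \"%s\": %s\n", sub, strerror(errno));
			failures++;
			continue;
		}
		if (S_ISREG(st.st_mode))
			failures += (sketch_fsync_path(sub, 0) != 0);
		else if (S_ISDIR(st.st_mode))
			failures += sketch_walk(sub);
		/* Symlinks and other file types are deliberately ignored. */
	}
	closedir(dir);

	/* Finally fsync the directory itself. */
	failures += (sketch_fsync_path(path, 1) != 0);
	return failures;
}
```

With this shape, a caller that wants pg_xlog or the entries of pg_tblspc
followed has to invoke the walker on those paths explicitly, which is the
approach described in (b) and (c).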
This correctly fsync()s stuff according to strace, and doesn't die if
there are unreadable files/links in PGDATA.
What I haven't done is return if AllocateDir() fails. I'm not convinced
that's correct: it won't complain if PGDATA is unreadable (though that
will break other things anyway, so it doesn't matter), but it will still
die if readdir fails rather than opendir.
I'm trying a couple of approaches to that (e.g. using readdir directly
instead of ReadDir), but other suggestions are welcome.
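
As a rough sketch of the "readdir directly" idea (plain POSIX, not the
patch itself), the two failure modes can be told apart by clearing errno
before each readdir() call. The function name sketch_scan_dir and the
ScanResult enum are hypothetical, and reporting goes to stderr rather
than through an elevel:

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
typedef enum
{
	SCAN_OK,
	SCAN_OPEN_FAILED,			/* opendir() failed */
	SCAN_READ_FAILED			/* readdir() failed partway through */
} ScanResult;

/*
 * Count the entries in a directory, reporting opendir() and readdir()
 * failures separately and returning to the caller in both cases.
 */
static ScanResult
sketch_scan_dir(const char *path, int *nentries)
{
	DIR		   *dir;
	struct dirent *de;
	ScanResult	result = SCAN_OK;

	assert(path != NULL && nentries != NULL);
	*nentries = 0;

	dir = opendir(path);
	if (dir == NULL)
	{
		fprintf(stderr, "could not open directory \"%s\": %s\n",
				path, strerror(errno));
		return SCAN_OPEN_FAILED;
	}

	for (;;)
	{
		/*
		 * readdir() returns NULL both at end of directory and on error;
		 * only errno distinguishes the two, so reset it first.
		 */
		errno = 0;
		de = readdir(dir);
		if (de == NULL)
		{
			if (errno != 0)
			{
				fprintf(stderr, "could not read directory \"%s\": %s\n",
						path, strerror(errno));
				result = SCAN_READ_FAILED;
			}
			break;
		}
		(*nentries)++;
	}

	closedir(dir);
	return result;
}
```

The caller can then decide per result code whether to skip the directory
or give up, instead of having either failure end the process.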
-- Abhijit
Attachment | Content-Type | Size |
---|---|---|
fsync-wip.patch | text/x-diff | 8.3 KB |