Re: pgsql: Map basebackup tablespaces using a tablespace_map file

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: hlinnaka <hlinnaka(at)iki(dot)fi>, pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Map basebackup tablespaces using a tablespace_map file
Date: 2015-05-12 17:30:51
Message-ID: CAA4eK1JuSn29N6SSrk19yoR160KrdObzgyLXyqH2QSGC+wdoRQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Tue, May 12, 2015 at 9:02 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

>
> On 05/12/2015 10:33 AM, Heikki Linnakangas wrote:
>
>> On 05/12/2015 04:42 PM, Andrew Dunstan wrote:
>>
>>> +
>>> + /*
>>> + * Remove the existing symlink if any and Create the
>>> symlink
>>> + * under PGDATA. We need to use rmtree instead of rmdir
>>> as
>>> + * the link location might contain directories or files
>>> + * corresponding to the actual path. Some tar utilities
>>> do
>>> + * things that way while extracting symlinks.
>>> + */
>>> + if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))
>>> + {
>>> + if (!rmtree(linkloc,true))
>>> + ereport(ERROR,
>>> + (errcode_for_file_access(),
>>> + errmsg("could not remove directory
>>> \"%s\": %m",
>>> + linkloc)));
>>> + }
>>> + else
>>> + {
>>> + if (unlink(linkloc) < 0 && errno != ENOENT)
>>> + ereport(ERROR,
>>> + (errcode_for_file_access(),
>>> + errmsg("could not remove symbolic link
>>> \"%s\": %m",
>>> + linkloc)));
>>> + }
>>> +
>>>
>>
>> That's scary. What tar utilitiy replaces the symlink with a non-empty
>> directory?
>>
>>
IIRC, it was star utility by using --copysymlinks options.
It will actually copy the symlink data at appropriate location,
but will not maintain symlink after extraction.
I don't have a link handly for it, but can again search for
it and send you if you want to have a look.

> What if you call pg_start_backup() and take the backup with a utility that
>> follows symlinks? I wouldn't recommend using such a tool, but with this
>> patch, it will zap all the tablespaces. Before this, you at least got a
>> working database and could read out all the data or fix the symlinks
>> afterwards.
>>
>>

Yeah, but I don't think user will do such a thing without
being aware of the same and if he is aware, he will either
restore the symlinks before starting the server or would
atleast keep a copy of such a backup until he is able to
restore the database completely.

Do you think adding a note in docs makes sense?

> Why is the RelationCacheInitFileRemove() call moved?
>>
>
> I will let Amit comment on this.
>
>
Because it assumes that tablespace directory pg_tblspc is all in
place and it tries to remove the files by reading pg_tblspc
directory as well. Now as we setup the symlinks in pg_tblspc
after reading tablespace_map file, so we should remove relcache init
file once the symlinks are setup in pg_tblspc directory.

>
>> XLogRecPtr
>>> do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID
>>> *starttli_p,
>>> - char **labelfile)
>>> + char **labelfile, DIR *tblspcdir, List **tablespaces,
>>> + char **tblspcmapfile, bool infotbssize,
>>> + bool needtblspcmapfile)
>>>
>>
>> Why does this need to take tblspcdir as argument?
>
>
It needs to be called from perform_base_backup() which already
has access to tblspcdir.

> Wouldn't it make more sense to open the directory inside
>> do_pg_start_backup?
>>
>
>
>
Good point. It does look cleaner to do that.
>
>
If you prefer that way, I can look into rearranging the
code.

>
>
>> In the BASE_BACKUP command, is there any reason to not always include the
>> tablespace map?
>
>
The main reason is that it is not required to restore symlinks
for plain format. Also I think it can cause a problem if user
has used --tablespace-mapping option in plain format as the
tablespace_map file doesn't contain information for same and recovery
will try to restore symlinks for wrong path based on tablespace_map
file.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2015-05-12 18:10:33 pgsql: Add support for doing late row locking in FDWs.
Previous Message Alvaro Herrera 2015-05-12 17:21:07 Re: [COMMITTERS] pgsql: pg_basebackup -F t now succeeds with a long symlink target

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2015-05-12 17:36:05 Re: Multi-xacts and our process problem
Previous Message Alvaro Herrera 2015-05-12 17:21:07 Re: [COMMITTERS] pgsql: pg_basebackup -F t now succeeds with a long symlink target