From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_rewind failure by file deletion in source server |
Date: | 2015-06-23 16:40:45 |
Message-ID: | CAHGQGwGTsV6-txUo3rd7GaLAHn9C0N2WBK9g9J5zJ11_JJV9Hg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Jun 23, 2015 at 11:21 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 06/23/2015 05:03 PM, Fujii Masao wrote:
>>
>> On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
>> wrote:
>>>
>>> On 06/23/2015 07:51 AM, Michael Paquier wrote:
>>>>
>>>>
>>>> So... Attached are a set of patches dedicated at fixing this issue:
>>>
>>>
>>>
>>> Thanks for working on this!
>>>
>>>> - 0001, add if_not_exists to pg_tablespace_location, returning NULL if
>>>> path does not exist
>>>> - 0002, same with pg_stat_file, returning NULL if file does not exist
>>>> - 0003, same with pg_read_*file. I added them to all the existing
>>>> functions for consistency.
>>>> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
>>>> (thanks Robert for the naming!)
>>>> - 0005, as things get complex, a set of regression tests aimed to
>>>> covering those things. pg_tablespace_location is platform-dependent,
>>>> so there are no tests for it.
>>>> - 0006, the fix for pg_rewind, using what has been implemented before.
>>>
>>>
>>>
>>> With thes patches, pg_read_file() will return NULL for any failure to
>>> open
>>> the file, which makes pg_rewind to assume that the file doesn't exist in
>>> the
>>> source server, and will remove the file from the destination. That's
>>> dangerous, those functions should check specifically for ENOENT.
>>
>>
>> I'm wondering if using pg_read_file() to copy the file from source server
>> is reasonable. ISTM that it has two problems as follows.
>>
>> 1. It cannot read very large file like 1GB file. So if such large file was
>> created in source server after failover, pg_rewind would not be able
>> to copy the file. No?
>
>
> pg_read_binary_file() handles large files just fine. It cannot return more
> than 1GB in one call, but you can call it several times and retrieve the
> file in chunks. That's what pg_rewind does, except for reading the control
> file, which is known to be small.
Yeah, you're right.
I found that pg_rewind creates a temporary table to fetch the file in chunks.
This would prevent pg_rewind from using the *hot standby* server as a source
server at all. This is of course a limitation of pg_rewind, but we might want
to alleviate it in the future.
>> 2. Many users may not allow a remote client to connect to the
>> PostgreSQL server as a superuser for some security reasons. IOW,
>> there would be no entry in pg_hba.conf for such connection.
>> In this case, pg_rewind always fails because pg_read_file() needs
>> superuser privilege. No?
>>
>> I'm tempting to implement the replication command version of
>> pg_read_file(). That is, it reads and sends the data like BASE_BACKUP
>> replication command does...
>
>
> Yeah, that would definitely be nice. Peter suggested it back in January
> (http://www.postgresql.org/message-id/54AC4801.7050300@gmx.net) I think
> it's way too late to do that for 9.5, however. I'm particularly worried that
> if we design the required API in a rush, we're not going to get it right,
> and will have to change it again soon. That might be difficult in a minor
> release. Using pg_read_file() and friends is quite flexible, even though we
> just find out that they're not quite flexible enough right now (the ENOENT
> problem).
I agree that it's too late to do what I said...
But just using pg_read_file() cannot address the #2 problem that I pointed
in my previous email. Also requiring a superuer privilege on pg_rewind
really conflicts with the motivation why we added replication privilege.
So we should change pg_read_file() so that even replication user can
read the file?
Or replication user version of pg_read_file() should be implemented?
Regards,
--
Fujii Masao
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2015-06-23 16:42:11 | Re: NULL passed as an argument to memcmp() in parse_func.c |
Previous Message | Jeff Janes | 2015-06-23 16:29:46 | btree_gin and BETWEEN |