Re: [PATCH]make pg_rewind to not copy useless WAL files

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: chenhj <chjischj(at)163(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH]make pg_rewind to not copy useless WAL files
Date: 2017-09-29 11:29:40
Message-ID: CAPpHfdvxXrhW8KQa6sn56uvjToH+mJLGTNXDozO-QGkzsPFyWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 29, 2017 at 10:07 AM, chenhj <chjischj(at)163(dot)com> wrote:

> On 2017-09-29 05:31:51, "Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru>
> wrote:
>
> On Thu, Sep 28, 2017 at 10:52 PM, chenhj <chjischj(at)163(dot)com> wrote:
>
>> On 2017-09-29 00:43:18,"Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru>
>> wrote:
>>
>> On Thu, Sep 28, 2017 at 6:44 PM, chenhj <chjischj(at)163(dot)com> wrote:
>>
>>> On 2017-09-28 01:29:29,"Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru>
>>> wrote:
>>>
>>> It appears that your patch conflicts with fc49e24f. Please, rebase it.
>>>
>>>
>>> Yes, i had rebased it, Please check the new patch.
>>>
>>
>> Good, now it applies cleanly.
>>
>> else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
>>> IsXLogFileName(path + strlen(XLOGDIR"/")) &&
>>> (strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0
>>> ||
>>> strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) >
>>> 0))
>>
>>
>> According to our conding style, you should leave a space betwen XLOGDIF
>> and "/".
>> Also, you do a trick by comparison xlog segment numbers using strcmp().
>> It's nice, but I would prefer seeing XLogFromFileName() here. It would
>> improve code readability and be less error prone during further
>> modifications.
>>
>>
>> Thanks for advice!
>> I had modified it.
>>
>
> OK. Patch becomes better.
> I also have more general question. Why do we need upper bound for segment
> number (last_source_segno)? I understand the purpose of lower bound
> (divergence_segno) which save us from copying extra WAL files, but what is
> upper bound for? As I understood, we anyway need to replay most recent WAL
> records to reach consistent state after pg_rewind. I propose to
> remove last_source_segno unless I'm missing something.
>
>
> Thanks for relay!
> When checkpoint occurs, some old WAL files will be renamed as future WAL
> files for later use.
> The upper bound for segment number (last_source_segno) is used to avoid
> copying these extra WAL files.
>
> When the parameter max_wal_size or max_min_size is large,these may be many
> renamed old WAL files for reused.
>
> For example, I have just looked at one of our production systems
> (max_wal_size = 64GB, min_wal_size = 2GB),
> the total size of WALs is about 30GB, and contains about 4GB renamed old
> WAL files.
>
> [postgres(at)hostxxx pg_xlog]$ ll
> ...
> -rw------- 1 postgres postgres 16777216 Sep 29 14:05
> 0000000100000BCF00000078
> -rw------- 1 postgres postgres 16777216 Sep 29 14:05
> 0000000100000BCF00000079
> -rw------- 1 postgres postgres 16777216 Sep 29 14:05
> 0000000100000BCF0000007A
> -rw------- 1 postgres postgres 16777216 Sep 29 14:05
> 0000000100000BCF0000007B
> -rw------- 1 postgres postgres 16777216 Sep 29 14:05
> 0000000100000BCF0000007C
> -rw------- 1 postgres postgres 16777216 Sep 29 14:05
> 0000000100000BCF0000007D
> -rw------- 1 postgres postgres 16777216 Sep 29 11:22
> 0000000100000BCF0000007E //after this, there are about 4GB WALs for reuse
> -rw------- 1 postgres postgres 16777216 Sep 29 11:08
> 0000000100000BCF0000007F
> -rw------- 1 postgres postgres 16777216 Sep 29 11:06
> 0000000100000BCF00000080
> -rw------- 1 postgres postgres 16777216 Sep 29 12:05
> 0000000100000BCF00000081
> -rw------- 1 postgres postgres 16777216 Sep 29 11:28
> 0000000100000BCF00000082
> -rw------- 1 postgres postgres 16777216 Sep 29 11:06
> 0000000100000BCF00000083
> ...
>

OK. That makes sense. Thank you for the explanation.

I still have some minor comments.

> /*
> + * Save the WAL filenames of the divergence and the current WAL insert
> + * location of the source server. Later only the WAL files between
> those
> + * would be copied to the target data directory.
>

Comment is outdated. We don't save filenames anymore, now we save segment
numbers.

> + * Note:The later generated WAL files in the source server before the
> end
> + * of the copy of the data files must be made available when the target
> + * server is started. This can be done by configuring the target
> server as
> + * a standby of the source server.
> + */

You miss space after "Note:". Also, it seems reasonable for me to leave
empty line before "Note:".

# Setup parameter for WAL reclaim

Parameter*s*, because you're setting up multiple of them.

# The accuracy of imodification from pg_ls_waldir() is seconds, so sleep
> one seconds

One second without "s".

Also, please check empty lines in 006_wal_copy.pl to be just empty lines
without tabs.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-09-29 11:48:15 Re: Parallel Append implementation
Previous Message Pavel Stehule 2017-09-29 10:15:22 Re: SQL/JSON in PostgreSQL