Re: Reliable WAL file shipping over unreliable network

From: Nagy László Zsolt <gandalf(at)shopzeus(dot)com>
To: Rui DeSousa <rui(dot)desousa(at)icloud(dot)com>, scott ribe <scott_ribe(at)elevated-dev(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Reliable WAL file shipping over unreliable network
Date: 2018-02-28 19:03:43
Message-ID: f1629420-0d42-0ef9-fbca-a9cfca5e7a01@shopzeus.com
Lists: pgsql-admin


>> There seem to be 2 fundamental misunderstandings here:
>>
>> 1) That other processes cannot see data written to a file until it is
>> flushed to disk; this is not true; while file data is still in file
>> cache, it is visible to other processes.
>>
>> 2) That rsync writes the file on the destination directly; it does
>> not; it writes into a temporary file and renames that file when it is
>> complete.
>>
>
> While you’re correct, I never made either of those assumptions.
>
Number 1 is irrelevant here because the files are sent from the master
and received by the slave: two different computers that don't share the
same cache. Number 2 clears up my confusion completely; I just didn't
know that rsync works that way. It must be documented somewhere in the
(very long) rsync manual. I just opened it and looked specifically for
this behaviour, but I cannot find it. There is a --temp-dir option, which
suggests that data is written to temporary files first, but that is only
a hint. I don't see anything explicit about writing data to temporary
files and renaming them once they are complete. It seems logical and I
believe you, but I did not want to make such assumptions either.

So, problem solved in theory. The next step is to do tests by simulating
network outages.
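
As a starting point for such tests, here is a minimal Python sketch of
the write-to-temporary-file-then-rename pattern described in point 2,
with a simulated outage in the middle of the transfer. This is not
rsync's code; copy_via_temp, SimulatedOutage and the fail_after
parameter are hypothetical names made up for this illustration, and the
sketch assumes that the temporary file and the final file are on the
same filesystem.

```python
import os
import tempfile


class SimulatedOutage(Exception):
    """Stands in for a dropped connection in this sketch."""


def copy_via_temp(src, dst, fail_after=None, chunk_size=64 * 1024):
    """Copy src to dst through a temporary file, renaming only on success.

    fail_after is a hypothetical test hook: once at least this many bytes
    have been written, SimulatedOutage is raised to mimic a network failure.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The temporary file lives next to dst so the final rename stays on
    # one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, prefix=".partial-")
    written = 0
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            while True:
                chunk = inp.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
                written += len(chunk)
                if fail_after is not None and written >= fail_after:
                    raise SimulatedOutage(f"link dropped after {written} bytes")
            # Push the data to disk before the file becomes visible as dst.
            out.flush()
            os.fsync(out.fileno())
        # The complete file appears under its final name in a single step.
        os.replace(tmp_path, dst)
    except BaseException:
        # On any failure, remove the partial file; dst is never created.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling copy_via_temp(src, dst, fail_after=100000) on a larger file
raises SimulatedOutage and leaves no file named dst behind, which is the
property a reader of the archive directory depends on.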

I also have a proposal: let's change the example in the PostgreSQL
documentation! The example archive_command presented in the docs uses a
simple cp command that does not write data to a temporary file first.
Anyone who tries to use that example on a production server will shoot
themselves in the foot. At the very least we should have a note there,
saying that it is the user's responsibility to make sure that only
complete WAL files appear in the slave's archive directory, and that the
cp command is just a simplistic example that should never be used on a
production server.
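
To make the proposal concrete, a safer example could look roughly like
the Python sketch below. It is only an illustration under assumptions:
archive_wal and main are hypothetical names, and the archive directory
is assumed to be a locally mounted path, as in the cp example. It
follows the two rules the documentation already states for
archive_command (return zero only on success, and do not overwrite an
existing archive file) and adds the temporary-file-and-rename step.

```python
import os
import shutil
import tempfile


def archive_wal(src_path, archive_dir, file_name):
    """Return 0 only if file_name is completely stored in archive_dir."""
    final_path = os.path.join(archive_dir, file_name)
    if os.path.exists(final_path):
        # Refuse to overwrite a segment that was archived earlier.
        return 1
    # The temporary name starts with a dot and never matches a WAL file
    # name, so a restore that asks for file_name cannot pick it up.
    fd, tmp_path = tempfile.mkstemp(dir=archive_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as out, open(src_path, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())
        # Only a complete file ever appears under the final name.
        os.rename(tmp_path, final_path)
    except OSError:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return 1
    return 0


def main(argv):
    # Expected arguments: %p (source path), archive directory, %f (file name).
    if len(argv) != 4:
        return 2
    return archive_wal(argv[1], argv[2], argv[3])
```

With sys.exit(main(sys.argv)) added at the bottom, the file could be
called from archive_command with %p, the archive directory and %f as
arguments. The script name and its location would be up to the
administrator.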

Thank you for your help!

   Laszlo
