From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | MauMau <maumau307(at)gmail(dot)com> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [RFC] What should we do for reliable WAL archiving? |
Date: | 2014-03-22 09:21:06 |
Message-ID: | 20140322092105.GA12234@svana.org |
Lists: | pgsql-hackers |
On Sat, Mar 22, 2014 at 06:22:37AM +0900, MauMau wrote:
> From: "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>
> >Do people really just copy the files from one directory of local
> >storage to
> >another directory of local storage? I don't see the point of that.
>
> It makes sense to archive WAL to a directory of local storage for
> media recovery. Here, the local storage is a different disk drive
> which is directly attached to the database server or directly
> connected through SAN.
I'm one of those people. The WAL files are archived into a local directory in
preparation for an rsync over ssh.
> >The recommendation is to refuse to overwrite an existing file of the same
> >name, and exit with failure. Which essentially brings archiving
> >to a halt,
> >because it keeps trying but it will keep failing. If we make a custom
> >version, one thing it should do is determine if the existing archived file
> >is just a truncated version of the attempting-to-be archived file, and if
> >so overwrite it. Because if the first archival command fails with a
> >network glitch, it can leave behind a partial file.
>
> What I'm trying to address is just an alternative to cp/copy which
> fsyncs a file. It just overwrites an existing file.
I ran into a related problem with cp: halfway through the copy the disk
filled up and I was left with half a WAL file. The rsync then copied
only that half file and replication broke. This is clearly a
recoverable situation, but it didn't recover in this case.
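
The check Jeff describes (is the file already in the archive just a truncated version of the one being archived?) could be sketched roughly as follows. This is only an illustration, not anything PostgreSQL ships: the function name `is_truncated_copy` and its parameters are made up for the example, and it assumes both files are regular files that are not being written to during the comparison.

```python
import os


def is_truncated_copy(existing_path, source_path, chunk_size=1024 * 1024):
    """Return True if existing_path is shorter than source_path and its
    contents match the beginning of source_path byte for byte.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch.
    """
    existing_size = os.path.getsize(existing_path)
    if existing_size >= os.path.getsize(source_path):
        # Same size or larger: not a truncated copy, whatever it contains.
        return False

    with open(existing_path, "rb") as existing, open(source_path, "rb") as source:
        remaining = existing_size
        while remaining > 0:
            n = min(chunk_size, remaining)
            a = existing.read(n)
            b = source.read(n)
            if not a or a != b:
                # Contents differ, or a file shrank underneath us.
                return False
            remaining -= len(a)
    return True
```

An archiving script could overwrite the existing file only when this returns True, and keep refusing (exit with failure) in every other case where the target already exists.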
> Yes, you're right, the failed archive attempt leaves behind a
> partial file which causes subsequent attempts to fail, if you follow
> the PG manual. That's another undesirable point in the current doc.
> To overcome this, someone on this ML recommended me to do "cp %p
> /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f".
> Does this solve your problem?
This would probably have handled it. Still, I find it odd that there's a
program to handle restoring of archives properly, while on the archiving
side you have to cobble together your own shell scripts, which fail in
various corner cases.
I'd love a program that just Did The Right Thing.
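
As a rough idea of what such a program might do, here is a minimal sketch of the "copy to a temporary name, fsync, then rename" approach discussed in this thread. It is not an existing tool: the function name `archive_wal` and the `.tmp` suffix are assumptions for the example, and it assumes a POSIX system where the temporary file and the final file are on the same filesystem (so the rename is atomic) and where a directory can be opened and fsynced.

```python
import os
import shutil


def archive_wal(src_path, dest_path):
    """Copy src_path to dest_path so that dest_path never exists in a
    partially written state. Hypothetical helper for illustration."""
    # Refuse to overwrite an existing archive file, as the manual advises.
    # Note: this check is not race-free against concurrent archivers.
    if os.path.exists(dest_path):
        raise FileExistsError(dest_path)

    tmp_path = dest_path + ".tmp"
    try:
        with open(src_path, "rb") as src, open(tmp_path, "wb") as tmp:
            shutil.copyfileobj(src, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the file contents to disk
        os.rename(tmp_path, dest_path)  # atomic within one filesystem
    except BaseException:
        # A failed attempt (e.g. disk full) leaves nothing under the
        # final name; remove the leftover temporary file if any.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # fsync the directory so the rename itself survives a crash.
    dir_fd = os.open(os.path.dirname(os.path.abspath(dest_path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A small command-line wrapper around this could be called from archive_command with %p and %f, exiting non-zero on any exception so that the server retries the segment later.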
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer