From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Gabriele Bartolini <gabriele(dot)bartolini(at)2ndquadrant(dot)it>, desmodemone <desmodemone(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Incremental Backup
Date: 2014-08-11 17:00:44
Message-ID: CAGTBQpb83nAA+hwbHG+jW7OEdZQUCHErzZUn=qrhMNHe+vMRYw@mail.gmail.com
Lists: pgsql-hackers
On Mon, Aug 11, 2014 at 12:27 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> As Marco says, that can be optimized using filesystem timestamps instead.
>
> The idea of using filesystem timestamps gives me the creeps. Those
> aren't always very granular, and I don't know that (for example) they
> are crash-safe. Does every filesystem on every platform make sure
> that the mtime update hits the disk before the data? What about clock
> changes made manually by users, or automatically by ntpd? I recognize
> that there are people doing this today, because it's what we have, and
> it must not suck too much, because people are still doing it ... but I
> worry that if we do it this way, we'll end up with people saying
> "PostgreSQL corrupted my data" and will have no way of tracking the
> problem back to the filesystem or system clock event that was the true
> cause of the problem, so they'll just blame the database.
I have the same creeps. I only rely on mtimes on a live system, after a
first full rsync, where mtime persistence across reboots is not an issue
and where I know no ntp adjustments have happened.
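
To make the trade-off concrete, here is a minimal sketch in Python of the
two comparison modes a delta sync can use. This is purely illustrative,
not rsync's actual logic; the function names are made up:

    # Illustrative sketch (not rsync's code) of the two comparison
    # modes: the "quick check" trusts size + mtime, while the safe
    # path reads and hashes both files.
    import hashlib
    import os

    def quick_check_unchanged(src: str, dst: str) -> bool:
        # What a timestamp-based delta sync trusts: same size, same mtime.
        s, d = os.stat(src), os.stat(dst)
        return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)

    def checksum_unchanged(src: str, dst: str) -> bool:
        # The slow but trustworthy path: compare actual contents.
        def digest(path: str) -> bytes:
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            return h.digest()
        return digest(src) == digest(dst)

The quick check is what makes the optimization fast, and it is exactly
the part that breaks when mtimes are coarse or have been perturbed.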
I had a problem once where a differential rsync based on timestamps
didn't work as expected and corrupted a slave. It was a test system, so I
didn't care much at the time, but if it had been a backup, I'd have been
quite pissed.
Basically, mtimes aren't trustworthy across reboots. Granted, this was a
very old system, Debian 5 when it was new, IIRC, so things may be better
now. But it does illustrate just how bad things can get when one trusts
timestamps. The slave on that test setup had fallen out of sync, and I
tried to re-synchronize it with a delta rsync to avoid the hours it would
take to actually compare everything (about a day). One segment that had
been modified after the sync loss was not transferred, causing trouble on
the slave, so I was forced to re-synchronize with a full rsync (delta
transfer, but comparing content rather than timestamps).
This was either before pg_basebackup or before I had heard of it ;-), but
in any case, if it happened on a test system with little activity, you
can be certain it can happen on a production system.
So I now only trust mtime when there has been neither a reboot nor an
ntpd running since the last mtime-less rsync. In those cases, the
optimization works and helps a lot. But I doubt you'll take many
incremental backups under those conditions.
Say what you will of anecdotal evidence, but the issue is quite clear
theoretically as well: modifications to file segments can fail to be
reflected within mtime granularity. There are many reasons why mtime can
lose precision; an old filesystem with second-precision timestamps is
just one of them, but not the only one.
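
As a purely hypothetical probe (nothing from this thread, and the file
name is made up), the failure mode is easy to show: overwrite a file with
different content of the same size, back to back, and check whether the
stored mtime actually moves. On a filesystem with one-second timestamps,
both writes usually land in the same tick:

    # Hypothetical probe: two same-sized writes in quick succession.
    # If st_mtime_ns doesn't move, a size+mtime comparison cannot
    # tell the two versions apart.
    import os

    PATH = "mtime_probe.tmp"  # illustrative name

    with open(PATH, "wb") as f:
        f.write(b"version one.")          # 12 bytes
    before = os.stat(PATH).st_mtime_ns

    with open(PATH, "wb") as f:
        f.write(b"version two.")          # 12 bytes again: size unchanged
    after = os.stat(PATH).st_mtime_ns

    if after == before:
        print("mtime unchanged: a size+mtime delta sync would miss this write")
    else:
        print("mtime moved by %d ns; granularity looks fine here" % (after - before))

    os.remove(PATH)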