Re: I don't want to back up index files

From: Glen Parker <glenebob(at)nwlink(dot)com>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: I don't want to back up index files
Date: 2009-03-12 23:53:53
Message-ID: 49B9A091.6020403@nwlink.com
Lists: pgsql-general

Tom Lane wrote:
> Glen Parker <glenebob(at)nwlink(dot)com> writes:
> Mainly because the idea doesn't seem to make sense unless that's part
> of the package. If you don't cut index changes out of the WAL load
> then the savings on the base backup alone aren't going to be all that
> exciting when you consider the total cost of PITR backup.

In our setting, I think it might be more exciting than you think. As I
said, I've not noticed any real impact to the system related to WAL
exporting, but the nightly backup does indeed have a significant impact
because of how long it runs. WAL export is a couple seconds every few
minutes, which nobody ever notices. The backup runs for a minimum of an
hour and fifteen minutes, which people definitely notice.

> Furthermore, you would need some very ugly hacks on the recovery process
> to make it ignore (rather than try to apply) WAL records relating to
> indexes. I believe there are a fair number of cases where the recovery
> process doesn't even know that a particular file is an index, because
> the WAL stream doesn't tell it. The live backends generating the WAL
> log entries typically know that (and could suppress the entries) but the
> recovery process has only a very limited view of reality. It cannot,
> for example, trust the system catalogs to be in a correct/consistent
> state, so it couldn't look up the info for itself.

Could the live backends label the log entries with "hints" to be used by
the replay process? In this case, I would think a simple flag
indicating whether replay is critical or not would suffice.
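
To make the flag idea concrete, here is a toy sketch in Python. It does not reflect PostgreSQL's actual WAL record format or recovery code; WalRecord, replay_critical, and records_to_apply are hypothetical names used only for illustration. The assumption is that the backend writing the record already knows whether it touches an index and stamps the record accordingly, so replay never has to consult the catalogs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Toy stand-in for a WAL record (hypothetical, not the real format)."""
    lsn: int                # position in the log
    relfilenode: int        # storage file the record applies to
    replay_critical: bool   # hypothetical hint set by the writing backend


def records_to_apply(records, skip_noncritical):
    """Return the records a replay loop would apply, in log order.

    When skip_noncritical is True, records flagged as not critical
    (e.g. index changes) are ignored instead of applied.
    """
    ordered = sorted(records, key=lambda r: r.lsn)
    if not skip_noncritical:
        return ordered
    return [r for r in ordered if r.replay_critical]


records = [
    WalRecord(lsn=1, relfilenode=16384, replay_critical=True),   # heap change
    WalRecord(lsn=2, relfilenode=16390, replay_critical=False),  # index change
    WalRecord(lsn=3, relfilenode=16384, replay_critical=True),   # heap change
]
applied = records_to_apply(records, skip_noncritical=True)
```

With skip_noncritical left False, the same function returns every record, which corresponds to recovery behaving as it does today.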

> BTW, there's a related problem with the idea, which is that the
> tools normally used to take base backups haven't got any way to
> distinguish indexes from any other kind of relation.

Yes, there's no doubt it would increase the complexity of the base
backup, IF a person chooses to ignore indexes. The upside is that
people who are happy with the backup as it is would have to do nothing
at all; it would just continue to work as it does now. To ignore
indexes (and only certain indexes at that), you'd have to examine the
system catalog as part of each backup. I already do that to some
extent, in order to discover all the extra tablespaces that need to be
backed up.
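
A minimal sketch of that catalog step, in Python, might look like the following. It assumes the backup script has already run a query such as the one shown against pg_class in each database (relkind = 'i' marks indexes, and relfilenode names the storage file) and collected the results into a set. The helper names should_skip and files_to_back_up are hypothetical, and the filename pattern is an assumption covering the plain file, extra segments such as "16390.1", and suffixed fork files.

```python
import re

# Query one might run in each database to list index storage files.
# Which indexes to leave out of the backup is a policy choice; this
# version simply selects all of them.
INDEX_FILENODE_QUERY = "SELECT relfilenode FROM pg_class WHERE relkind = 'i'"

# Matches "16390", "16390.1" (extra segment) and "16390_fsm" (fork file).
_RELFILE_RE = re.compile(r"^(\d+)(?:_[a-z]+)?(?:\.\d+)?$")


def should_skip(filename, index_filenodes):
    """True if filename looks like a storage file of an excluded index."""
    match = _RELFILE_RE.match(filename)
    if match is None:
        return False  # not a relation file (e.g. PG_VERSION): always keep
    return int(match.group(1)) in index_filenodes


def files_to_back_up(filenames, index_filenodes):
    """Filter one database directory listing down to the files to copy."""
    return [f for f in filenames if not should_skip(f, index_filenodes)]


listing = ["16384", "16384.1", "16390", "16390.1", "PG_VERSION"]
kept = files_to_back_up(listing, index_filenodes={16390})
```

Passing an empty set keeps every file, which matches the point above: anyone happy with the current backup changes nothing.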

I guess the biggest problem I see with this is that it would have rather
a small target audience.

-Glen
