I don't want to back up index files

From: Glen Parker <gparker(at)servicepaper(dot)com>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: I don't want to back up index files
Date: 2009-03-11 01:42:53
Message-ID: 49B7171D.2010508@servicepaper.com
Lists: pgsql-general

I am wondering about the feasibility of having PG continue to work even if
non-essential indexes are gone or corrupt. I brought this basic concept
up at some point in the past, but now I have a different motivation, so
I want to strike up discussion about it again. This time around, I
simply don't want to back up indexes if I don't have to. Because
indexes contain essentially redundant data, losing one does not equate
to losing real data. Therefore, backing them up represents a lot of
overhead for very little benefit.

Here's the basic idea:

1) New field to pg_index (indvalid boolean).
2) Query planner skips indexes where indvalid = false.
3) Executor does not update indexes where indvalid = false.
4) Executor refuses insert or update to unique columns where indvalid =
false, throwing an error.
5) WAL roll forward marks indvalid = false if index file(s) are missing,
rather than panicking.
6) REINDEX recognizes syntax to only build indexes with indvalid =
false, marks indvalid = true.
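
To make the intended behavior concrete, here is a toy model of the six points above in Python. It is only a sketch of the proposed semantics, not PostgreSQL code: the Table and Index classes and the method names (usable_indexes, mark_missing, reindex_invalid) are made up for illustration, and only the indvalid flag comes from the proposal itself.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    name: str
    column: str
    unique: bool = False
    indvalid: bool = True  # point 1: the proposed pg_index flag
    entries: dict = field(default_factory=dict)  # key value -> list of row ids


class Table:
    """Hypothetical stand-in for a heap table plus its indexes."""

    def __init__(self, indexes):
        self.rows = []  # the "real" data; never depends on any index
        self.indexes = list(indexes)

    def usable_indexes(self):
        # Point 2: the planner only considers indexes with indvalid = true.
        return [ix for ix in self.indexes if ix.indvalid]

    def insert(self, row):
        # Point 4: uniqueness cannot be checked without the unique index,
        # so the write is refused with an error.
        for ix in self.indexes:
            if ix.unique and not ix.indvalid:
                raise RuntimeError(f"unique index {ix.name} is invalid")
        for ix in self.usable_indexes():
            if ix.unique and row[ix.column] in ix.entries:
                raise ValueError(f"duplicate key in {ix.name}")
        self.rows.append(row)
        rowid = len(self.rows) - 1
        # Point 3: only valid indexes are maintained.
        for ix in self.usable_indexes():
            ix.entries.setdefault(row[ix.column], []).append(rowid)

    def mark_missing(self, name):
        # Point 5: a missing index file flags the index instead of panicking.
        for ix in self.indexes:
            if ix.name == name:
                ix.indvalid = False
                ix.entries.clear()

    def reindex_invalid(self):
        # Point 6: rebuild only the invalid indexes from the table data.
        rebuilt = []
        for ix in self.indexes:
            if not ix.indvalid:
                ix.entries = {}
                for rowid, row in enumerate(self.rows):
                    ix.entries.setdefault(row[ix.column], []).append(rowid)
                ix.indvalid = True
                rebuilt.append(ix.name)
        return rebuilt
```

In this model, losing a non-unique index only removes it from usable_indexes(), while losing a unique index blocks inserts until reindex_invalid() has rebuilt it from the rows.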

Close to 25% of the on-disk bulk of my database is index files. It
would save a significant amount of the system resources used during the
backup, if I didn't have to archive the index files. In the unlikely
event that a restore/roll forward becomes necessary, I could simply
issue something like "REINDEX DATABASE foo INVALID;" to restore all the
missing indexes and return the database to full function. Prior to a
reindex, the database would perform poorly and refuse to do certain
inserts and updates, but the data would be available. Backup files
would be smaller, and the restore/roll forward would be faster.

No downsides jump out at me, and it seems to me that for a regular PG
code hacker this could actually be fairly simple to implement.

Any chance of something like this being done in the future?

-Glen
