From: | daveg <daveg(at)sonic(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Alex Hunsaker <badalex(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Patches <pgsql-patches(at)postgresql(dot)org>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [HACKERS] Patch for Prevent pg_dump/pg_restore from being affected by statement_timeout |
Date: | 2008-06-25 03:57:19 |
Message-ID: | 20080625035719.GG12245@sonic.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
On Tue, Jun 24, 2008 at 10:41:07PM -0400, Tom Lane wrote:
> daveg <daveg(at)sonic(dot)net> writes:
> > Are we talking about the same patch?
>
> Maybe not --- I thought you were talking about a backend-side behavioral
> change.
>
> > Because I don't know what you are
> > refering to with "timer management code" and "scheduling the interrupt" in
> > the context of pg_dump.
>
> I'm not sure that I see a good argument for making pg_dump deliberately
> fail, if that's what you're proposing. Maybe I'm just too old-school,
> but there simply is not any other higher priority for a database than
> safeguarding your data.
We agree about that. The intent of my patch it to give the user a chance to
take corrective action in a case where pg_dump cannot be relied on to succeed.
The problem is that pg_dump can get blocked behind locks and then fail hours
later when the locks are released because some table it had not locked yet
changed. In the worst case:
- no backup,
- no notice until too late to restart the backup,
- lost production due to other processes waiting on locks pg_dump holds.
So the intent of the patch is to optionally allow pg_dump to fail quickly
if it cannot get all the access share locks it needs. This gives the user
an opportunity to notice and retry in a timely way.
Please see http://archives.postgresql.org/pgsql-patches/2008-05/msg00351.php
for the orginal patch and problem description.
A sample failure instance from a very heavy batch environment with a lot of
materialized views being maintained concurrently with pg_dump. DB size
is about 300 GB:
---
20080410 14:53:49 dumpdb c04_20080410_public: dumping c04 to /backups/c04_20080410_public
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: cache lookup failed for index 22619852
pg_dump: The command was: SELECT t.tableoid, t.oid, t.relname as indexname, pg_catalog.pg_get_indexdef(i.indexrelid) as indexdef, t.relnatts as indnkeys, i.indkey, i.indisclustered, c.contype, c.conname, c.tableoid as contableoid, c.oid as conoid, (SELECT spcname FROM pg_catalog.pg_tablespace s WHERE s.oid = t.reltablespace) as tablespace, array_to_string(t.reloptions, ', ') as options FROM pg_catalog.pg_index i JOIN pg_catalog.pg_class t ON (t.oid = i.indexrelid) LEFT JOIN pg_catalog.pg_depend d ON (d.classid = t.tableoid AND d.objid = t.oid AND d.deptype = 'i') LEFT JOIN pg_catalog.pg_constraint c ON (d.refclassid = c.tableoid AND d.refobjid = c.oid) WHERE i.indrelid = '22615005'::pg_catalog.oid ORDER BY indexname
20080411 06:12:17 dumpdb FATAL: c04_20080410_public: dump failed
---
Note that the dump started at 14:53, but did not fail until 06:12 the next day,
and it never got to the actual copy out phase. Meanwhile other DDL using
processes were hung on the access share locks aready held by pg_dump.
Regards
-dg
--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Hammond | 2008-06-25 07:40:35 | the un-vacuumable table |
Previous Message | Tom Lane | 2008-06-25 03:03:16 | Re: proposal for smaller indexes on index-ordered tables |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2008-06-25 08:19:06 | Re: variadic function support |
Previous Message | Tom Lane | 2008-06-25 02:49:09 | Re: variadic function support |