Re: pg_dump generating unrestorable data (8.4)

From: Eric McKeeth <eldin00(at)gmail(dot)com>
To: Glenn Maynard <glenn(at)zewt(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump generating unrestorable data (8.4)
Date: 2011-04-06 08:32:16
Message-ID: BANLkTimsO3hHPAMbDHpPjjC=CU0WyxF=XA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Apr 3, 2011 at 12:49 PM, Glenn Maynard <glenn(at)zewt(dot)org> wrote:

> After dumping a database (pg_dump -F c database > dump), trying to restore
> it (pg_restore dump) gives:
>
> > pg_restore: [archiver (db)] Error from TOC entry 2463; 0 58451 TABLE DATA
> table user
> > pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte sequence
> for encoding "UTF8": 0xe3273a
> > HINT: This error can also happen if the byte sequence does not match the
> encoding expected by the server, which is controlled by "client_encoding".
> > CONTEXT: COPY table, line 1
>
> The surface reason for this is clear enough: invalid UTF-8 data crept into
> some tsvector columns. This is a much more serious problem, however: the
> backup tools for the database are, without warning, generating data that
> can't be restored.
>
> When in a data recovery situation, a backup that won't restore is
> catastrophic. I can't restore the database to a state it was in at the time
> of the backup; I have to spend hours of downtime figuring out what to do to
> make something usable out of my backup; and then I have to hope I've
> corrected the backup correctly before bringing the server back online. (If
> I was in an actual backup recovery situation--fortunately I'm not--I'd be
> more inclined to edit the Postgresql source to disable this check while
> restoring the backup than to risk trying to manually fix the backup data
> directly, which is very easy to get wrong.)
>

Two questions come to my mind on reading this. 1st, are you certain that the
database you're restoring to has the same encoding as the source database?
I'd have a hard time considering it an error if a dump from a database with
SQL_ASCII or some non-unicode encoding failed to restore to a UTF8 encoded
database in this manner, for example. And 2nd, does specifying the client
encoding when making the backup, with 'pg_dump -E UTF8', produce behavior
closer to what you would have expected?

-Eric

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Zeev Ben-Sender 2011-04-06 09:28:13 Re: Is index rebuilt upon updating table with the same values as already existing in the table?
Previous Message John R Pierce 2011-04-06 08:32:07 Re: Is there any provision to take incremental backup