> shouldn't pg_dump encode the utf8 bytesequences?
at least i found out why the invalid unicode sequences appear in the first
place: tsearch2 in 8.1 doesn't properly handle utf8 characters: the
character's 2-byte representation is converted to lowercase byte for byte.
for example: "ä", which is encoded in utf8 as the bytes 0xC3 0xA4 (displayed
as "Ã¤" when read as latin1), is written to the db by tsearch2 as "ã¤"
(0xE3 0xA4), which is an invalid utf8 byte sequence.
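
a minimal python sketch of the effect described above, not tsearch2's actual code. it assumes the per-byte lowercasing behaves like tolower() in a latin1 locale; the function names bytewise_lower_latin1 and is_valid_utf8 are made up for this illustration.

```python
def bytewise_lower_latin1(data: bytes) -> bytes:
    """Lowercase each byte on its own, the way a latin1 tolower() would.

    Assumption: this mimics the byte-for-byte lowercasing described above;
    it is not taken from the tsearch2 source.
    """
    out = bytearray()
    for b in data:
        is_ascii_upper = 0x41 <= b <= 0x5A                  # 'A'..'Z'
        is_latin1_upper = 0xC0 <= b <= 0xDE and b != 0xD7   # 'À'..'Þ' except '×'
        out.append(b + 0x20 if is_ascii_upper or is_latin1_upper else b)
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as utf8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


original = "ä".encode("utf-8")              # lead byte 0xC3, continuation byte 0xA4
mangled = bytewise_lower_latin1(original)   # 0xC3 is treated as 'Ã' and becomes 0xE3

print(original.hex(), is_valid_utf8(original))
print(mangled.hex(), is_valid_utf8(mangled))
```

the lead byte 0xC3 announces a 2-byte sequence, but after lowercasing it becomes 0xE3, which announces a 3-byte sequence. the bytes that follow no longer match, so a strict utf8 decoder rejects the result.
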
stripping the ts2 index column before dumping fixes the encoding problems. i
guess the 8.2 -> 8.1.5 backport should fix it as well; i'll try it asap.
> also, regarding pg_restore, its quite troubling it has the same
> parameter-set as pg_dump
never mind this, it is too late in the evening 8-)
- thomas