From: | "Mikel Lindsaar" <raasdnil(at)gmail(dot)com> |
---|---|
To: | "Tore Halset" <halset(at)pvv(dot)ntnu(dot)no>, pgsql-admin(at)postgresql(dot)org |
Subject: | Re: unable to restore 8.2.5 |
Date: | 2008-09-19 13:37:03 |
Message-ID: | 57a815bf0809190637h49abd565x108ce52bc92a9d0a@mail.gmail.com |
Lists: | pgsql-admin |
On Fri, Sep 19, 2008 at 6:29 PM, Tore Halset <halset(at)pvv(dot)ntnu(dot)no> wrote:
> Looks like I have managed to insert an illegal character into the main
> system that does not conform to UTF-8. Anything I can and should do to work
> around this issue?
I have had the same problem before and, after a lot of help from
Tom Lane, came up with the following.
You need to dump your table (or a subset containing the row ID and
the column that holds the bad data) as plain text, parse it with a
script to detect invalid UTF-8 sequences, find which rows the bad data
is in, and then go and fix them.
The alternative is to drop the bad data, substituting some other
character for the invalid bytes. But this has obvious drawbacks.
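As a minimal sketch of that second option, assuming a Ruby new enough
to have String#scrub (2.1 or later): the helper name
replace_invalid_utf8 and the '?' marker are made up for illustration.
Note that the original bytes are lost once they are replaced.

```ruby
# Hypothetical helper: replace every invalid UTF-8 byte sequence with a marker.
# String#scrub needs Ruby 2.1 or later.
def replace_invalid_utf8(text, marker = '?')
  # Work on a copy tagged as UTF-8 so the bytes are checked as UTF-8.
  text.dup.force_encoding(Encoding::UTF_8).scrub(marker)
end

# \xE9 is a Latin-1 "e acute", which is not a valid UTF-8 sequence here.
bad = "caf\xE9 au lait".b
puts replace_invalid_utf8(bad)
```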
I wrote a short Ruby script that goes through a dumped file line by
line and runs each line through Iconv, converting it from UTF-8 to
UTF-8. If the conversion fails, it dumps the offending line to a log
file.
A Ruby script that just prints the offending rows would go
something like this:
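# Iconv was removed from Ruby's standard library in 2.0; on newer
# versions this needs the iconv gem.
require 'iconv'

# Read the dump one line at a time instead of loading it all at once.
File.foreach(ARGV[0]) do |line|
  begin
    # Converting UTF-8 to UTF-8 raises on an invalid byte sequence.
    Iconv.iconv('UTF-8', 'UTF-8', line)
  rescue
    puts "Failed: #{line}"
  end
end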
Save that in a file (find_invalid_utf8.rb) then run it with:
$ ruby find_invalid_utf8.rb my_dumped_table.csv
It's not pretty, and just dumps the raw output to the screen, but it
might do for you.
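On newer Ruby versions, where Iconv is no longer part of the standard
library (it was removed in 2.0), the same check can be sketched with
String#valid_encoding?. This is only a sketch under that assumption,
not the script I actually used; the method name invalid_utf8_lines is
made up for illustration. It also reports the line number, which makes
it easier to find the row in the dump.

```ruby
# Hypothetical helper: return [line_number, line] pairs for every line of
# the file at `path` that is not valid UTF-8.
def invalid_utf8_lines(path)
  bad = []
  # Read as raw bytes so Ruby does not transcode anything while reading.
  File.open(path, 'rb') do |file|
    file.each_line.with_index(1) do |line, number|
      # Tag the bytes as UTF-8, then ask Ruby whether they are well formed.
      bad << [number, line] unless line.force_encoding(Encoding::UTF_8).valid_encoding?
    end
  end
  bad
end

# Run only when a file name is given on the command line.
if ARGV[0]
  invalid_utf8_lines(ARGV[0]).each do |number, line|
    # inspect shows the invalid bytes as escapes such as \xE9.
    puts "Line #{number}: #{line.inspect}"
  end
end
```

It is run the same way as the script above, with the dumped file as
the only argument.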
--
http://lindsaar.net/
Rails, RSpec and Life blog....