From: | Hunter Hillegas <lists(at)lastonepicked(dot)com> |
---|---|
To: | Kris Jurka <books(at)ejurka(dot)com> |
Cc: | PostgreSQL <pgsql-general(at)postgresql(dot)org>, Postgre JDBC List <pgsql-jdbc(at)postgresql(dot)org> |
Subject: | Re: [JDBC] Invalid Character Data Problem |
Date: | 2004-11-30 06:49:23 |
Message-ID: | BDD157F3.4F01E%lists@lastonepicked.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-jdbc |
I got this going with a dump/reload.
Beware however, 8.0b5 and the JDBC driver don't seem to work with UNICODE
encoding for the database. You have to use b4 or HEAD.
Hunter
> From: Kris Jurka <books(at)ejurka(dot)com>
> Date: Tue, 30 Nov 2004 01:34:46 -0500 (EST)
> To: Hunter Hillegas <lists(at)lastonepicked(dot)com>
> Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>, Postgre JDBC List
> <pgsql-jdbc(at)postgresql(dot)org>
> Subject: Re: [JDBC] Invalid Character Data Problem
>
>
>
> On Fri, 26 Nov 2004, Hunter Hillegas wrote:
>
>> When I SELECT from a certain table, I see this JDBC exception:
>>
>> "Invalid character data was found. This is most likely caused by stored
>> data containing characters that are invalid for the character set the
>> database was created in. The most common example of this is storing 8bit
>> data in a SQL_ASCII database."
>>
>> The database is indeed of type SQL_ASCII. The table stores mailing list data
>> and has about 400,000 rows.
>>
>> Looking at the data via psql, I see that some of the rows have strange
>> characters in them, such as question marks where I would not expect them,
>> etc...
>>
>> What are my options? Is there a way to identify the 'bad' records, or the
>> ones causing trouble?
>>
>
> To really solve this problem you need to have a correctly encoded
> database. This will involve a dump and restore process and possibly
> recoding your data. This is straightforward if you know what
> encoding your data is, although it will cause some downtime.
>
> To detect the bad data you can try various SELECTs with the JDBC driver
> and see what errors out. The function below will determine if a
> particular field has data with the high bit set which is something the
> database really doesn't know what to do with.
>
> SELECT pkcolumn, hashighbit(columna), hashighbit(columnb) FROM mytable;
>
> Kris Jurka
>
> CREATE OR REPLACE FUNCTION hashighbit(text) RETURNS boolean AS '
> DECLARE
> i int;
> BEGIN
> i := LENGTH($1);
> WHILE i > 0 LOOP
> IF ascii(substring($1, i, 1)) >= 128 THEN
> RETURN true;
> END IF;
> i := i-1;
> END LOOP;
> RETURN false;
> END;
> ' LANGUAGE 'plpgsql';
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
From | Date | Subject | |
---|---|---|---|
Next Message | Johan Wehtje | 2004-11-30 06:54:24 | Column n.nsptablespace does not exist error |
Previous Message | Kris Jurka | 2004-11-30 06:34:46 | Re: [JDBC] Invalid Character Data Problem |
From | Date | Subject | |
---|---|---|---|
Next Message | Kris Jurka | 2004-11-30 08:11:10 | Re: Bug in JDBC-Driver? |
Previous Message | Kris Jurka | 2004-11-30 06:34:46 | Re: [JDBC] Invalid Character Data Problem |