On Thu, Feb 10, 2011 at 9:02 PM, Scott Ribe <scott_ribe(at)elevated-dev(dot)com> wrote:
> I know that I have at least one instance of a varchar that is not valid UTF-8, imported from a source with errors (AMA CPT files, actually) before PG's checking was as stringent as it is today. Can anybody suggest a query to find such values?
CREATE OR REPLACE FUNCTION is_utf8(text)
RETURNS bool AS $$
    try:
        args[0].decode('utf8')
        return True
    except UnicodeDecodeError:
        return False
$$ LANGUAGE plpythonu STRICT;
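
For anyone who wants to try the check outside the database first, here is a minimal standalone sketch of the same idea in plain Python 3. The sample byte strings are made up for illustration, and it assumes the value reaches the function as raw bytes, as it does under plpythonu.

```python
def is_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Illustrative inputs only (not taken from the original data):
samples = [
    b"caf\xc3\xa9",  # "cafe" with e-acute, encoded as UTF-8
    b"caf\xe9",      # same text in Latin-1; 0xE9 starts an incomplete UTF-8 sequence
    b"",             # empty input decodes without error
]

for s in samples:
    print(repr(s), is_utf8(s))
```

Once the function above is created, a query along the lines of SELECT id FROM some_table WHERE NOT is_utf8(some_col) should list the offending rows. Here some_table, id, and some_col are placeholder names, and the query is untested against your schema.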
--
marko