From: | Jan Urbański <wulczer(at)wulczer(dot)org> |
---|---|
To: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, pg-peter(at)alvh(dot)no-ip(dot)org |
Subject: | Re: bug in Google translate snippet |
Date: | 2009-07-03 08:08:07 |
Message-ID: | 4A4DBC67.6060205@wulczer.org |
Lists: | pgsql-hackers |
Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>>
>> Alvaro Herrera wrote:
>>> Hi,
>>>
>>> I was having a look at this snippet:
>>> http://wiki.postgresql.org/wiki/Google_Translate
>>> and it turns out that it doesn't work if the result contains non-ASCII
>>> chars. Does anybody know how to fix it?
>>>
>>> alvherre=# select gtranslate('en', 'es', 'he');
>>> ERROR: plpython: function "gtranslate" could not create return value
>>> DETALLE: <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)
>> This looks like a python issue rather than a Postgres issue. The problem
>> is probably in python-simplejson.
>
> I think the problem happens when the PL tries to create the output
> value. Otherwise I wouldn't be able to see the value in plpy.log.
The problem is that the thing you are trying to return
(resp['responseData']['translatedText']) is a Unicode object, so you
can't just return or print it as-is. The error comes from Python
complaining that you are trying to output a non-ASCII character
(u'\xe9') using the 'ascii' codec, which cannot encode it.
One solution is to explicitly encode the Unicode string with some codec,
that is, ask Python to convert the Unicode object into a byte string
using some encoding, UTF-8 being a good choice here. For instance

    return resp['responseData']['translatedText'].encode('utf-8')

worked for me.
See also http://docs.python.org/tutorial/introduction.html#unicode-strings
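To make the difference concrete, here is a minimal sketch that runs outside PL/Python. The resp dictionary is a hypothetical stand-in for the decoded JSON response (the real snippet gets it from the translation service), and the value u'\xe9l' is only assumed to match the text from the report. It shows that the 'ascii' codec rejects the string while an explicit UTF-8 encode succeeds:

```python
# Hypothetical stand-in for the parsed JSON response; in the wiki snippet
# this comes from the translation service, not from a literal.
resp = {'responseData': {'translatedText': u'\xe9l'}}

text = resp['responseData']['translatedText']

# Encoding with the 'ascii' codec fails on the non-ASCII character,
# which is the same kind of failure as in the reported error.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly as UTF-8 succeeds and yields a byte string
# that can be handed back to the caller.
encoded = text.encode('utf-8')
```

Whether UTF-8 is the right target depends on the database encoding; it is assumed here because it can represent any Unicode character.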
Cheers,
Jan