trivial DoS on char recoding

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: trivial DoS on char recoding
Date: 2006-06-20 21:32:49
Message-ID: 20060620213249.GI26882@surnet.cl
Lists: pgsql-hackers

Oswaldo Hernandez just reported this in the pgsql-es-ayuda list.
Basically, a conversion between UTF8 and windows_1250 can crash the
server.

I recall a bug in this general area of the code, but I don't recall it
being able to provoke a PANIC.

To reproduce, create a cluster with UTF-8 encoding and locale es_ES (I'm
actually using es_CL but it should be the same). Note that the es_ES
locale is declared to use Latin1 encoding, not UTF-8. In a psql
session,

template1=# copy foo from '/tmp/foo' ;
ERROR: no existe la relación «foo»
template1=# \encoding latin1
template1=# copy foo from '/tmp/foo' ;
ERROR: could not convert UTF8 character 0x00f3 to ISO8859-1
template1=# \encoding windows_1250
template1=# copy foo from '/tmp/foo' ;
PANIC: ERRORDATA_STACK_SIZE exceeded

Neither the table "foo" nor the /tmp/foo file needs to exist.

In the server logs, I set "log_line_prefix" to %x (Xid) to make it
obvious that these reports are all produced while processing the same
message. When the PANIC occurs, the server logs this:

574 ERROR: no existe la relación «foo»
574 WARNING: ignorando el carácter UTF-8 no convertible 0xf36e20ab
574 WARNING: ignorando el carácter UTF-8 no convertible 0xe16374
574 WARNING: ignorando el carácter UTF-8 no convertible 0xe16374
574 WARNING: ignorando el carácter UTF-8 no convertible 0xe16374
574 PANIC: ERRORDATA_STACK_SIZE exceeded
574 SENTENCIA: copy foo from '/tmp/datoscopy' ;

To reproduce, you need to be using a non-C locale (es_ES works for me).
If I start the postmaster with -c lc_messages=C, the problem does not
occur. Note that the PO file for the Spanish translation is written in
Latin1, not UTF8. So I would venture that the server is trying to recode
a string which is originally in Latin1, but assuming it is UTF-8, to
Win1250.
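
The byte values in the warnings logged above are consistent with that
guess. The following is a minimal Python sketch, not the server's code:
the helper names utf8_seq_len and misread_chunks are hypothetical, and
the chunking rule (trust the lead byte for the sequence length) is an
assumption about how a UTF-8 reader would walk the string. It encodes
the translated messages as Latin1 and then splits them the way such a
reader would.

```python
def utf8_seq_len(lead: int) -> int:
    """Sequence length a UTF-8 reader would infer from the lead byte alone."""
    if lead < 0x80:
        return 1          # plain ASCII
    if lead & 0xE0 == 0xC0:
        return 2
    if lead & 0xF0 == 0xE0:
        return 3
    if lead & 0xF8 == 0xF0:
        return 4
    return 1              # stray continuation byte or invalid lead byte


def misread_chunks(message: str) -> list[str]:
    """Encode `message` as Latin1, then list the multi-byte chunks a
    UTF-8 reader would pick out of those bytes."""
    raw = message.encode("latin-1")
    chunks = []
    i = 0
    while i < len(raw):
        n = utf8_seq_len(raw[i])
        if n > 1:
            chunks.append("0x" + raw[i:i + n].hex())
        i += n
    return chunks


# Messages as they appear in the Spanish server log.
error_msg = "no existe la relación «foo»"
warning_msg = "ignorando el carácter UTF-8 no convertible 0xf36e20ab"

print(misread_chunks(error_msg))    # ['0xf36e20ab']  ("ón «" in Latin1)
print(misread_chunks(warning_msg))  # ['0xe16374']    ("áct" in Latin1)
```

The first chunk matches the first WARNING, and the second matches the
repeated 0xe16374 warnings: the bytes come from the word "carácter" in
the warning text itself, so each warning would in turn produce another
warning of the same kind.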

Now, it can be argued that this is really operator error -- because I
can't crash the server if I correctly initdb with es_CL.UTF8. Should we
get firmer in rejecting invalid configurations?

I'm not sure to what extent this affects other translations, collations,
or encodings -- right now I only have "es" (Spanish) compiled and my
system is not configured to accept anything else.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
