EOL characters and multibyte encodings

From: Joe Conway <mail(at)joeconway(dot)com>
To: "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject: EOL characters and multibyte encodings
Date: 2007-06-21 22:27:17
Message-ID: 467AFB45.4010102@joeconway.com
Lists: pgsql-hackers

I finally got PL/R to compile and run on Windows recently. This has
led to people using a Windows-based client (typically PgAdmin III) to
create PL/R functions. I immediately started to receive reports of
failures that turned out to be due to the carriage return (\r) used in
standard Win32 EOLs (\r\n). It seems that the R parser only accepts
newlines (\n), even on Win32 (confirmed on the r-devel list with a core
developer).

My first thought on fixing this issue was to simply replace all
instances of '\r' in pg_proc.prosrc with '\n' prior to sending it to the
R parser. As far as I know, any instances of '\r' embedded in a
syntactically valid R statement must be escaped (i.e. literally the
characters "\" and "r"), so that should not be a problem. But I am
concerned about how this interacts with multibyte characters. Is it
safe to do this, or do I need to use a multibyte-aware replace algorithm?
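
For illustration, here is a minimal sketch of the byte-wise replacement
described above, written in Python rather than PL/R's C. The function
name normalize_eols and the sample source are hypothetical. The sketch
assumes the source text is UTF-8, where every byte of a multibyte
character has its high bit set and so can never equal the carriage
return byte 0x0D. Whether that holds for every encoding the function
source might be stored in is exactly the open question.

```python
def normalize_eols(src: bytes) -> bytes:
    """Replace every carriage-return byte (0x0D) with a newline byte (0x0A).

    Works on raw bytes without decoding. Assumption: the encoding never
    uses 0x0D as part of a multibyte character (true for UTF-8).
    """
    return src.replace(b"\r", b"\n")


# Hypothetical function body as a Windows client might send it,
# including a non-ASCII character inside a string literal.
windows_src = 'x <- "héllo"\r\ny <- nchar(x)\r\n'.encode("utf-8")
unix_src = normalize_eols(windows_src)

# An escaped carriage return inside an R string literal is the two
# characters backslash and "r", so it contains no 0x0D byte to replace.
escaped_src = normalize_eols(b'cat("a\\rb")')
```

Each \r\n becomes \n\n, so every Windows line ending leaves an empty
line behind, and the multibyte character still decodes correctly after
the replacement.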

Thanks,

Joe
