Re: postgres8.3beta encodding problem?

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, marcelo Cortez <jmdc_marcelo(at)yahoo(dot)com(dot)ar>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: postgres8.3beta encodding problem?
Date: 2007-12-18 15:54:03
Message-ID: 20071218155403.GF13268@svana.org
Lists: pgsql-general

On Tue, Dec 18, 2007 at 10:35:39AM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > Ok, but that doesn't apply in this case, his database appears to be
> > LATIN1 and this character is valid for that encoding...
>
> You know what, I think the test in the code is backwards.
>
> is_mb = pg_encoding_max_length(encoding) > 1;
>
> if ((is_mb && (cvalue > 255)) || (!is_mb && (cvalue > 127)))

It does seem to be a bit weird. For single-byte encodings anything
up to 255 is OK, well, sort of. It depends on what you want chr() to do
(oh no, not this discussion again). If you subscribe to the idea that
it should use Unicode code points then the test is completely bogus,
since whether or not the character is valid has nothing to do with
whether the encoding is multibyte or not.
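
To illustrate the code-point reading: a validity check under that
interpretation would look only at the value itself and never at the
encoding. A minimal sketch, assuming that "valid" means "is a Unicode
scalar value"; the function name is hypothetical and this is not
PostgreSQL code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if cp is a Unicode scalar value, i.e. within
 * 0..0x10FFFF and outside the surrogate range 0xD800..0xDFFF.
 * The database encoding plays no part in the decision.
 */
bool is_unicode_scalar_value(uint32_t cp)
{
    if (cp > 0x10FFFF)
        return false;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return false;
    return true;
}
```

Whether a given code point can then be represented in the database
encoding is a separate question from whether it is a valid code point.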

If you want the output of chr() to (logically) depend on the encoding
then the test makes more sense, but then it's inverted. Single-byte
encodings are by definition defined up to 255. And multibyte
encodings (other than UTF-8 I suppose) can only see the ASCII subset.
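
To make the "inverted" point concrete, here is a rough sketch that puts
the quoted test next to a version with the two limits swapped. The
max_len parameter is a stand-in for the result of
pg_encoding_max_length(encoding), and both function names are made up
for illustration; this is not a proposed patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The test as quoted above: returns true when chr() would reject cvalue.
 * max_len is an assumed stand-in for pg_encoding_max_length(encoding).
 */
bool chr_rejects_as_quoted(int max_len, uint32_t cvalue)
{
    bool is_mb = max_len > 1;

    return (is_mb && (cvalue > 255)) || (!is_mb && (cvalue > 127));
}

/*
 * Same test with the limits swapped: single-byte encodings accept
 * 0..255, multibyte encodings accept only the ASCII range 0..127.
 */
bool chr_rejects_swapped(int max_len, uint32_t cvalue)
{
    bool is_mb = max_len > 1;

    return (is_mb && (cvalue > 127)) || (!is_mb && (cvalue > 255));
}
```

With max_len = 1 (a single-byte encoding such as LATIN1) and
cvalue = 233, the quoted test rejects the value while the swapped one
accepts it, which matches the behaviour reported in this thread.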

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
> -- John F Kennedy
