Re: BUG #7913: TO_CHAR Function & Turkish collate

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: a_dursun(at)hotmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7913: TO_CHAR Function & Turkish collate
Date: 2013-03-03 15:42:59
Message-ID: 16008.1362325379@sss.pgh.pa.us
Lists: pgsql-bugs

a_dursun(at)hotmail(dot)com writes:
> prod=# SELECT TO_CHAR('2013-03-01'::date,'DAY');
> to_char
> ----------
> FRDAY
> (1 row)
> But it must return as FRIDAY.
> Our database lc_collate is tr_TR.UTF-8 and encoding is UTF8.

It looks like the cause of this is that the result is computed as
str_toupper("Friday"), and str_toupper() applies a collation-sensitive
upcasing rule.

I think the use of str_toupper() is appropriate when processing the
locale-specific string for a TMDAY specification; but plain DAY is not
supposed to be locale-dependent, so we probably should use an ASCII-only
upcasing rule in the non-TM code path.
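
To illustrate the distinction, here is a minimal sketch of an ASCII-only
upcasing rule. It is not the actual PostgreSQL code or the eventual patch:
the names ascii_toupper_byte and ascii_upcase_in_place are hypothetical and
chosen only for this example. The point is that such a rule never consults
the locale, so "i" always becomes "I", whereas a Turkish locale maps
lowercase i to the dotted capital İ. Bytes outside a-z, including every byte
of a multibyte UTF-8 sequence, are left unchanged.

```c
#include <stddef.h>

/*
 * Hypothetical helper: upcase a single byte using ASCII rules only.
 * Only 'a'..'z' are changed; the current locale is never consulted.
 */
static unsigned char
ascii_toupper_byte(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch = (unsigned char) (ch - 'a' + 'A');
    return ch;
}

/*
 * Hypothetical helper: upcase the first len bytes of buf in place.
 * Bytes with the high bit set (parts of UTF-8 multibyte characters)
 * fall outside 'a'..'z' and therefore pass through untouched.
 */
static void
ascii_upcase_in_place(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = (char) ascii_toupper_byte((unsigned char) buf[i]);
}
```

Applied to the English day name "Friday", this gives "FRIDAY" whatever the
database's locale settings are, which is the behavior expected for plain DAY.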

Anybody have an opinion on whether to back-patch such a fix? It seems
conceivable that somebody out there is relying on the current behavior.
OTOH, I believe that only Turkish UTF8 locales exhibit this behavior
(the single-byte-encoding code path in str_toupper acts differently for
historical reasons). So it's pretty inconsistent as it stands.

regards, tom lane
