From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Open issues for collations |
Date: | 2011-04-09 21:33:30 |
Message-ID: | 1302384810.9864.13.camel@vanquo.pezone.net |
Lists: | pgsql-hackers |
On Mon, 2011-03-28 at 20:02 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > On Sat, 2011-03-26 at 00:36 -0400, Tom Lane wrote:
> >> * It'd sure be nice if we had some nontrivial test cases that work
> >> in encodings besides UTF8. I'm still bothered that the committed
> >> patch failed to cover single-byte-encoding cases in
> >> upper/lower/initcap.
>
> > Well, how do we want to maintain these test cases without doing too
> > much duplication? It would be easy to run a small sed script over
> > collate.linux.utf8.sql to create, say, a latin1 version out of it.
>
> I tried. The upper/lower test cases require Turkish characters that
> aren't in Latin1. I'm not sure if we can readily produce test cases
> that cover both sorting changes and case-folding changes in just one
> single-byte encoding --- anybody?
>
> One thing I noticed but didn't push to committing is that the test
> case has a largely-unnecessary assumption about how the local system's
> locale names spell "utf8". We could eliminate that by having it use
> the trimmed locale names created by initdb.

I see you went for the latter option. That works pretty well already.
I've also been playing around with separating out the "Turkish" tests
into a separate file. That would then probably get the remaining
"latin1" file passing, if we also dropped the encoding mention from this
error message:

    ERROR: collation "foo" for encoding "UTF8" does not exist

I had thought hard about this in the past and didn't want to do it, but
since we are now making every effort to effectively hide collations with
the wrong encoding, this would possibly be acceptable.

I'm also seeing promising signs that we might get this test (minus
Turkish, perhaps) passing on Windows.
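
To illustrate the "trimmed locale names" idea from the quoted text, here is a
minimal sketch in Python. It assumes that trimming simply means dropping the
".encoding" part of a locale name while keeping any "@modifier" suffix. The
helper name trim_locale_name is hypothetical and this is not the actual initdb
code, only a rough picture of why a test that uses the trimmed names no longer
depends on how the local system spells "utf8".

```python
import re

# Assumption: the encoding part starts with "." and consists of ASCII
# letters, digits and hyphens; it ends at "@" or at the end of the name.
_ENCODING_PART = re.compile(r"\.[A-Za-z0-9-]+")


def trim_locale_name(name: str) -> str:
    """Hypothetical helper: strip the '.encoding' part from a locale name."""
    # Remove only the first match so a "@modifier" suffix is left intact.
    return _ENCODING_PART.sub("", name, count=1)


if __name__ == "__main__":
    # Different spellings of the same locale collapse to one trimmed name.
    for raw in ("tr_TR.utf8", "tr_TR.UTF-8", "de_DE.UTF-8@euro", "C"):
        print(raw, "->", trim_locale_name(raw))
```

With names trimmed this way, "tr_TR.utf8" and "tr_TR.UTF-8" both map to
"tr_TR", so a test script can refer to one spelling regardless of platform.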