From: | Oleg Bartunov <obartunov(at)gmail(dot)com> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Dealing with collation and strcoll/strxfrm/etc |
Date: | 2016-03-29 08:54:24 |
Message-ID: | CAF4Au4yjYVMDdNtMAAr4e=Ut-JAjocFkNW3DA0KJFNAbx6ky2w@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Mar 28, 2016 at 5:57 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> All,
>
> Changed the thread name (we're no longer talking about release
> notes...).
>
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > Oleg Bartunov <obartunov(at)gmail(dot)com> writes:
> > > Should we start thinking about ICU ?
> >
> > Isn't it still true that ICU fails to meet our minimum requirements?
> > That would include (a) working with the full Unicode character range
> > (not only UTF16) and (b) working with non-Unicode encodings. No doubt
> > we could deal with (b) by inserting a conversion, but that would take
> > a lot of shine off the performance numbers you mention.
> >
> > I'm also not exactly convinced by your implicit assumption that ICU is
> > bug-free.
>
> We have a wiki page about ICU. I'm not sure that it's current, but if
> it isn't and people are interested then perhaps we should update it:
>
> https://wiki.postgresql.org/wiki/Todo:ICU
>
>
Good point, I forgot about this page.
> If we're going to talk about minimum requirements, I'd like to argue
> that we require whatever system we're using to have versioning (which
> glibc currently lacks, as I understand it...) to avoid the risk that
> indexes will become corrupt when whatever we're using for collation
> changes. I'm pretty sure that's already bitten us on at least some
> RHEL6 -> RHEL7 migrations in some locales, even forgetting the issues
> with strcoll vs. strxfrm.
>
Agreed.
>
> Regarding key abbreviation and performance, if we are confident that
> strcoll and strxfrm are at least independently internally consistent
> then we could consider offering an option to choose between them.
> We'd need to identify what each index was built with to do so, however,
> as they would need to be rebuilt if the choice changes, at least
> until/unless they're made to reliably agree. Even using only one or the
> other doesn't address the versioning problem though, which is a problem
> for all currently released versions of PG and is just going to continue
> to be an issue.
>
Ideally, we should benchmark all locales on all platforms for all kinds of
indexes. But that's a big project.
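
As a rough illustration of the strcoll/strxfrm consistency question raised above, here is a minimal sketch in Python (not PostgreSQL code) that checks whether the two functions order a set of strings the same way under a given locale. The function name `inconsistent_pairs` and its parameters are hypothetical, and the sketch assumes the named locale is installed on the machine running it; it only tests the strings you give it, so an empty result does not prove the locale is consistent in general.

```python
import itertools
import locale


def sign(n):
    """Collapse any integer to -1, 0 or 1."""
    return (n > 0) - (n < 0)


def inconsistent_pairs(strings, collate_locale="C"):
    """Return the pairs on which strcoll() and strxfrm() disagree.

    Hypothetical helper for illustration only. It switches LC_COLLATE
    to `collate_locale`, compares every pair both ways, and restores
    the previous LC_COLLATE setting before returning.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query, do not change
    locale.setlocale(locale.LC_COLLATE, collate_locale)
    try:
        # Transform each string once, as a sort using strxfrm keys would.
        keys = {s: locale.strxfrm(s) for s in strings}
        bad = []
        for a, b in itertools.combinations(strings, 2):
            by_coll = sign(locale.strcoll(a, b))
            by_xfrm = (keys[a] > keys[b]) - (keys[a] < keys[b])
            if by_coll != by_xfrm:
                bad.append((a, b))
        return bad
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)
```

Running this over a large word list for each installed locale would be one cheap way to find locales where the two functions disagree, before spending time on benchmarks for them.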
>
> Thanks!
>
> Stephen
>