From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Wim Lewis <wiml(at)omnigroup(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: B-Tree support function number 3 (strxfrm() optimization) |
Date: | 2014-07-29 00:23:54 |
Message-ID: | CAM3SWZTy3MxgXbG6373cWh9rex=RhmAj6-kuXx03yRX5-vYKgQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jul 28, 2014 at 5:14 PM, Wim Lewis <wiml(at)omnigroup(dot)com> wrote:
> A quick glance at OSX's strxfrm() suggests they're using an implementation of strxfrm() from FreeBSD. You can find the source here:
>
> http://www.opensource.apple.com/source/Libc/Libc-997.90.3/string/FreeBSD/strxfrm.c
>
> (and a really quick glance at the contents of libc on OSX 10.9 reinforces this--- I don't see any calls into their CoreFoundation unicode string APIs.)
Something isn't quite accounted for, then. The FreeBSD behavior is to
append the primary weights only. That makes their returned blobs
smaller than those you'll see on Linux, but also appears to imply that
their implementation is substandard (the PostgreSQL port uses ICU on
FreeBSD for a reason, I suppose). However, FreeBSD did not add extra,
redundant "header bytes" right in the primary level when I tested it,
whereas I'm told Mac OS X does. I guess it could be that the collations
shipped differ, but I can't think why that would be. It does seem
peculiar that the Mac OS X blobs are always printable, whereas that
isn't the case with glibc (the only such restriction there is that
the blob contains no NUL bytes), and the Unicode Collation Algorithm
standard specifically says that non-printable sort keys are okay.
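
For anyone who wants to compare blobs across platforms, something like
the following rough sketch may help. It is not code from the patch; the
helper names xfrm_blob() and blob_is_printable() are made up for
illustration, and it assumes only the standard C strxfrm() contract
(calling it with a zero-sized buffer returns the length needed).

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd copy of the strxfrm() blob for
 * src under the current LC_COLLATE, and store its length (excluding the
 * terminating NUL) in *len.  Returns NULL if allocation fails.
 */
static unsigned char *
xfrm_blob(const char *src, size_t *len)
{
    /* With a zero-sized buffer, strxfrm() only reports the space needed. */
    size_t need = strxfrm(NULL, src, 0);
    char  *buf = malloc(need + 1);

    if (buf == NULL)
        return NULL;
    *len = strxfrm(buf, src, need + 1);
    return (unsigned char *) buf;
}

/* Hypothetical helper: 1 if every byte of the blob is printable, else 0. */
static int
blob_is_printable(const unsigned char *blob, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        if (!isprint(blob[i]))
            return 0;
    }
    return 1;
}

/* Print the blob as hex so that platforms can be compared by eye. */
static void
dump_blob(FILE *out, const unsigned char *blob, size_t len)
{
    for (size_t i = 0; i < len; i++)
        fprintf(out, "%02x ", blob[i]);
    fputc('\n', out);
}
```

Calling setlocale(LC_COLLATE, "") first and then feeding the same
strings through xfrm_blob() and dump_blob() on each platform should show
whether the blob lengths and the printable-only property differ in the
way described above.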
--
Peter Geoghegan