Re: Very puzzling sort behavior

From: Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com>
To: Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Very puzzling sort behavior
Date: 2015-09-10 20:49:18
Message-ID: CAD3a31XC25Nafcsjy_XCtdzUyy=rsywXCgXKJOD4+tT0A-x0-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Sep 10, 2015 at 12:56 PM, Peter Geoghegan <
peter(dot)geoghegan86(at)gmail(dot)com> wrote:

> On Thu, Sep 10, 2015 at 12:51 PM, Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com> wrote:
> > OK, can one of you help me out in understanding this? I would have
> thought that given "CLARK," and "CLARKE" that the comma would get compared
> against the E and come first. End of story, before we even get to anything
> farther in the string. What am I missing?
>
> That's only how it works with the C locale. Otherwise, there are
> complicated rules to weigh things like space and punctuation (and
> accents/diacritics) less prominently than primary alphabetical
> ordering. This is often useful. Anyway, based on what you say here, I
> think you should actually "ORDER BY name_last, name_first".
>
> --
> Regards,
> Peter Geoghegan
>

Thanks. A little more help would be appreciated. First a little context:

What I mailed out what a boiled down example. In reality, what I have is a
ton of tables with a client_id in them, and a convenience function
client_name(client_id) that returns the name_last, name_first string (plus
an alias if it exists). client_name is used all over the place in both
views and in an app that uses the database. There is a similar, also
heavily used, staff_name function. Eliminating the use of these functions
is a non-starter for me--I'd much rather live with the existing sort
behavior, especially as no one has ever pointed this out despite over a
decade of use.

I'm hoping to sort change this behavior with as minimal a change as
possible (e.g., minimal potential for unexpected side effects or
breakage). I was hoping to just add a COLLATE "C" within the function:

CREATE OR REPLACE FUNCTION client_name( cid int4 ) RETURNS text AS $$
-- client is a view that draws from tbl_client. name_full is the
field with the name_last, name_first data in it
SELECT name_full COLLATE "C" FROM client WHERE client_id=$1;
$$

but that seems to have no effect. And sure enough the documentation seems
to back that up. ("The collation assigned to a function or operator's
combined input expressions is also considered to apply to the function or
operator's result, if the function or operator delivers a result of a
collatable data type.") So this may be wishful thinking, but is there any
other way to specify the collation of a function result? Specifying the
collation every time the function is used is likely a no-go for me too.

Alternatively, it seems I could create new databases with a C collation and
then move my data into them. This seems a bit drastic, although possible.
I'd again be worried about the breakage/side effects. And actually, will
this work? (i.e., can you use pg_dump to populate a new database with a
different locale?)

Are there any other potential solutions, pitfalls or considerations that
come to mind? Any thoughts welcome. And as I said, if there's not a good
way to do this I'll probably leave it alone. Thanks.

Ken

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://agency-software.org/demo/client
<https://agency-software.org/demo/client>*
ken(dot)tanzer(at)agency-software(dot)org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request(at)lists(dot)sourceforge(dot)net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2015-09-10 21:02:37 Re: Very puzzling sort behavior
Previous Message Peter Geoghegan 2015-09-10 19:56:24 Re: Very puzzling sort behavior