Re: sortsupport for text

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Peter Geoghegan" <peter(at)2ndquadrant(dot)com>
Cc:	"Robert Haas" <robertmhaas(at)gmail(dot)com>, "Greg Stark" <stark(at)mit(dot)edu>, "PG Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: sortsupport for text
Date:	2012-06-19 17:57:34
Message-ID:	4FE0773E02000025000486DD@gw.wicourts.gov
Lists:	pgsql-hackers

Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

>> I'm pretty sure that when I was using Sybase ASE the order for
>> non-equal values was always predictable, and it behaved in the
>> manner I describe below. I'm less sure about any other product.
>
> Maybe it used a physical row identifier as a tie-breaker? Note
> that we use ItemPointers as a tie-breaker for sorting index
> tuples.
>
> I imagine that it was at least predictable among columns being
> sorted, if only because en_US.UTF-8 doesn't have any notion of
> equivalence (that is, it just so happens that there are no two
> strings that are equivalent but not bitwise equal). It would
> surely be impractical to do a comparison for the entire row, as
> that could be really expensive.

We weren't using en_US.UTF-8 collation (or any other "proper"
collation) on Sybase -- I'm not sure whether they even supported
proper collation sequences on the versions we used. I'm thinking of
when we were using their "case insensitive" sorting. I don't know
the implementation details, but the behavior was consistent with
including each character-based column twice: once in the requested
position in the ORDER BY clause but folded to a consistent case, and
again after all the columns in the ORDER BY clause in original form,
with C collation.
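
A minimal sketch of the ordering described above, in Python and purely for illustration. It assumes every ORDER BY column is text, uses `casefold()` as a stand-in for "folded to a consistent case", and uses a bytewise comparison of the UTF-8 encoding as a stand-in for C collation. The function name `folded_then_original_key` is hypothetical, and nothing here reflects how Sybase actually implemented it.

```python
def folded_then_original_key(row):
    """Sort key: every column case-folded, followed by every column unmodified."""
    # First pass: the columns in ORDER BY position, folded to one case.
    folded = tuple(col.casefold() for col in row)
    # Second pass: the same columns in original form, compared bytewise
    # (a stand-in for C collation). Only consulted when all folded
    # columns compare equal.
    original = tuple(col.encode("utf-8") for col in row)
    return folded + original


# Hypothetical (last_name, first_name) rows, as in ORDER BY last_name, first_name.
rows = [
    ("smith", "Bob"),
    ("Smith", "alice"),
    ("SMITH", "Carl"),
    ("Smith", "bob"),
]

ordered = sorted(rows, key=folded_then_original_key)
for row in ordered:
    print(row)
```

With this key, the case of the last name never disturbs the ordering by first name; case only decides between rows whose folded columns are all equal, so the result is still fully deterministic.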

I wasn't aware that en_US.UTF-8 doesn't have equivalence without
equality. I guess that surprising result in my last post is just
plain inevitable with that collation then. Bummer. Is there
actually anyone who finds that to be a useful behavior? For a
collation which considered upper-case and lower-case to be
equivalent, would PostgreSQL sort as I wanted, or is it doing some
tie-break per column within equivalent values?
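
To make the question concrete, here is a sketch of what a tie-break per column would do under a hypothetical case-insensitive collation. It is in Python, with `casefold()` standing in for the collation's notion of equivalence and bytewise comparison standing in for the tie-break. The name `per_column_tiebreak_key` is made up for illustration, and the sketch makes no claim about which behavior PostgreSQL actually has.

```python
def per_column_tiebreak_key(row):
    """Sort key: each column contributes (folded value, original bytes) in turn."""
    # The bytewise tie-break is applied inside each column, so a case
    # difference in an earlier column is settled before any later
    # column is looked at.
    return tuple((col.casefold(), col.encode("utf-8")) for col in row)


# Hypothetical (last_name, first_name) rows, as in ORDER BY last_name, first_name.
rows = [
    ("smith", "Bob"),
    ("Smith", "alice"),
    ("SMITH", "Carl"),
    ("Smith", "bob"),
]

per_column = sorted(rows, key=per_column_tiebreak_key)
for row in per_column:
    print(row)
```

Here "Carl" sorts ahead of "alice" purely because "SMITH" precedes "Smith" bytewise, even though the collation treats the two last names as equivalent. Deferring the tie-break until after all columns would instead give the first names in the order alice, bob, Bob, Carl.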

-Kevin

In response to

Re: sortsupport for text at 2012-06-19 17:20:15 from Peter Geoghegan

Responses

Re: sortsupport for text at 2012-06-19 18:44:35 from Peter Geoghegan
Re: sortsupport for text at 2012-06-19 18:46:42 from Tom Lane
