Curious sorting puzzle

From: Ivan Voras <ivoras(at)fer(dot)hr>
To: pgsql-performance(at)postgresql(dot)org
Subject: Curious sorting puzzle
Date: 2006-06-07 19:26:18
Message-ID: 4487285A.5050104@fer.hr
Lists: pgsql-performance

The situation is this: we're using a varchar column to store
alphanumeric codes which are by themselves 7-bit clean. But we are
operating under a locale which has its own special collation rules, and
is also utf-8 encoded. Recently we've discovered a serious "d'oh!"-type
bug which we tracked down to the fact that when we sort by this column
the collation respects locale sorting rules, which is messing up other
parts of the application.

The question is: what is the most efficient way to solve this problem
(the required operation is to sort data using binary "collation" - i.e.
compare byte by byte)? Since this field gets queried a lot it must have
an index. Some of the possible solutions we thought of are: replacing
the varchar type with numeric and doing magical transcoding (bad, needs
changes throughout the application), and inserting spaces after every
character (not as bad, but still requires modifying both the application
and the data). An ideal solution would be to have a
"not-locale-affected-varchar" field type :)
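
To make the required ordering concrete, here is a minimal Python sketch of
the difference between a byte-by-byte sort and a locale-style sort. It only
illustrates the ordering asked for above and is not a PostgreSQL solution.
The sample codes and both function names are hypothetical, and
locale_like_key is a crude stand-in (ignore case and punctuation), not the
rules of any real locale.

```python
# Hypothetical sample codes; the original post does not list any.
codes = ["a1", "B2", "A10", "b-3", "A-2"]


def binary_key(code: str) -> bytes:
    """Sort key that compares byte by byte (binary "collation").

    Encoding as ASCII also checks that the code is 7-bit clean:
    a non-ASCII character raises UnicodeEncodeError.
    """
    return code.encode("ascii")


def locale_like_key(code: str) -> str:
    """Crude stand-in for a locale collation (an assumption, not a real
    locale): case is ignored and punctuation is skipped."""
    return "".join(ch for ch in code.lower() if ch.isalnum())


# Byte-wise order: '-' (0x2D) < digits < uppercase < lowercase.
binary_order = sorted(codes, key=binary_key)

# Locale-like order: case and punctuation no longer separate the codes.
locale_like_order = sorted(codes, key=locale_like_key)

print(binary_order)
print(locale_like_order)
```

The two lists come out in different orders, which is the kind of mismatch
described above: the application expects the first ordering, while a
locale-aware sort produces something closer to the second.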
