Re: Case insensitive collation

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Marcel van Pinxteren <marcel(dot)van(dot)pinxteren(at)gmail(dot)com>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, Jasen Betts <jasen(at)xnet(dot)co(dot)nz>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Case insensitive collation
Date: 2013-01-21 21:45:02
Message-ID: CAOR=d=3Qd7_QdThGAYG3tQjwSUQijUrf-cwn6MLP+7uiNcNBkQ@mail.gmail.com
Lists: pgsql-general

On Mon, Jan 21, 2013 at 9:25 AM, Marcel van Pinxteren
<marcel(dot)van(dot)pinxteren(at)gmail(dot)com> wrote:
> To be honest, the reason I don't want to use citext and lower(), is me being
> lazy. If I have to use these features, there is more work for me converting
> from SQL Server to Postgresql. I have to make more changes to my database,
> and more to my software.
> But, developers are generally lazy, so you could argue that this reason is
> "compelling".
> The other reason, is that I assume that "lower()" adds overhead, so makes
> things slower than they need to be.
> Whether that is true, and if that is a compelling reason, I don't know.

Honestly, as a lazy DBA, I have to say it'd be pretty easy to write a
script to convert any unique text index into a unique text index with
an upper() in it. As another poster added, collation ain't free
either. I'd say you should test it to see. My experience tells me
that having an upper() (or lower()) index is not a big performance
hit. If the index would take too much storage because the text fields
are large, then make it an md5(lower()) index, which WILL cost more
CPU, but allows > 3k or so of text in a column to be indexed and
costs less IO.
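
A minimal sketch of the kind of script meant above, assuming you
already have a list of (table, column) pairs to convert. It only
generates the CREATE UNIQUE INDEX statements as text; the function
names, the index naming scheme, and the example tables "customers" and
"documents" are all made up for illustration, and schema-qualified
table names are not handled.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def case_insensitive_unique_index(table: str, column: str,
                                  use_md5: bool = False) -> str:
    """Return a CREATE UNIQUE INDEX statement on lower(column).

    With use_md5=True the index is built on md5(lower(column)) instead,
    so each index entry is a 32-character hex digest regardless of how
    long the text is.
    """
    expr = f"lower({quote_ident(column)})"
    suffix = "lower"
    if use_md5:
        expr = f"md5({expr})"
        suffix = "md5_lower"
    # Hypothetical naming scheme: <table>_<column>_<suffix>_key
    index_name = quote_ident(f"{table}_{column}_{suffix}_key")
    return (f"CREATE UNIQUE INDEX {index_name} "
            f"ON {quote_ident(table)} ({expr});")


# Hypothetical tables and columns, for illustration only.
print(case_insensitive_unique_index("customers", "email"))
print(case_insensitive_unique_index("documents", "body", use_md5=True))
```

Queries only benefit from such an index if they compare against the
same expression, e.g. WHERE lower(email) = lower('Someone@Example.com').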
