Re: text column indexing in UTF-8 database

From: Steve Atkins <steve(at)blighty(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: text column indexing in UTF-8 database
Date: 2009-03-13 00:32:59
Message-ID: 4875B492-F3CF-425A-AE4A-6AE9CDB86125@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Mar 12, 2009, at 5:15 PM, Reece Hart wrote:

> Do I really need 4 indexes per column to handle the 4 combinations of
> {equality,like/regexp} x {unfolded,casefolded} in a UTF-8 encoded
> database?
>
> I have a column that I'd like to be able to search with equality and
> regexp (or like), optionally casefolded. The database is UTF-8
> encoded.
> The table and index defs are below.
>
> Jeff Davis gave me a tip to use text_pattern_ops on indexes to speed
> up
> regexp and like; that worked beautiful. But I discovered a caveat that
> t_p_o apparently doesn't handle equality. Thus, I think I need
> distinct
> indexes for the 4 cases above. Right?

If A=B then lower(A) = lower(B), and if A like B then lower(A) like
lower(B).

So, if nothing else, you could rewrite "where alias = 'Foo'" as
"where lower(alias) = lower('Foo') and alias='Foo'" and take advantage
of the lower() functional index.

I've read that 8.4 will be able to use a text_pattern_ops index for
equality.

Cheers,
Steve

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Davis 2009-03-13 01:02:43 Re: text column indexing in UTF-8 database
Previous Message Adrian Klaver 2009-03-13 00:23:05 Re: Fwd: Question about Privileges